MUCIN-DERIVED PROTEINS FOR THE DIAGNOSIS, IMAGING, AND THERAPY OF HUMAN
CANCER



TECHNICAL FIELD 
The present invention relates to a newly-discovered
group of protein products of the MUC1 gene and diagnostic
and therapeutic methods for utilising the same, as well as
diagnostic and therapeutic compositions containing the
same.



BACKGROUND OF THE INVENTION
Polymorphic, high molecular weight glycoproteins are
abundantly expressed in human breast carcinomas. These
proteins, designated MUC1 (also referred to as episialin,
H23Ag, PEM, EMA, CA15-3, MCA, etc.), are heavily glycosylated
with O-glycosidic-linked carbohydrate side chains and, as
such, have mucin-like characteristics [for review, see J.
Hilkens, et al., "Cell Membrane-Associated Mucins and Their
Adhesion Modulating Property," TIBS, Vol. 17, pp. 359-363
(1992)]. Although MUC1 proteins are expressed at basal
levels by most secretory epithelial tissues, their
expression is dramatically increased in malignant breast
epithelial cells [S.X. Xing, et al., "Reactivity of
Anti-Human Milk Fat Globule Antibodies with Synthetic
Peptides," J. Immunol., Vol. 142, pp. 3503-3509 (1989)]. The
fact that disease status in breast cancer patients is
routinely assessed by monitoring the serum levels of
circulating tandem repeat array containing MUC1 protein,
using commercial assays such as CA15-3 and MCA (mammary
carcinoma antigen), underscores the unequivocal importance of
MUC1 gene expression to human breast cancer. That increased
MUC1 expression may reflect a change in the differentiation
status of the malignant epithelial cells is indicated by
high levels of MUC1 expression also in lactating mammary
epithelial tissue, where it is localized at the apical
surfaces. Due to the loss of cellular architecture in
breast cancer tissue, MUC1 is no longer expressed solely on
the apical surface and this, in conjunction with the finding
that MUC1 expression reduces cell-cell adhesion [M.J.L.
Ligtenberg, et al., "Suppression of Cellular Aggregation by
High Levels of Episialin," Cancer Res., Vol. 52,
pp. 2318-2324 (1992)], may enhance the invasiveness of the
breast cancer cell.

Molecular studies, including cDNA and gene cloning,
have elucidated many properties of the MUC1 proteins
[D.H. Wreschner, et al., "Isolation and Characterization of
Full Length cDNA Coding for the H23 Breast Tumor Associated
Antigen," in Breast Cancer: Progress in Biology, Clinical
Management and Prevention, M.A. Rich, J.C. Hager and I.
Keydar, Eds., Kluwer Academic Publishers, Boston, Mass.,
U.S.A., pp. 41-59 (1989); D.H. Wreschner, et al., "Human
Epithelial Tumor Antigen cDNA Sequences - Differential
Splicing May Generate Multiple Protein Forms," Eur. J.
Biochem., Vol. 189, pp. 463-473 (1990)]. The MUC1 gene
product best characterized so far is a polymorphic, type 1
transmembrane molecule that consists of a large
extracellular domain, a transmembrane domain and a 69 amino
acid cytoplasmic tail. The genetic polymorphism derives from
a 20 amino acid repeat motif, rich in serine, threonine and
proline residues, that varies in number from approximately
20 to 100 repeats. The feature of a tandemly repeating
domain is shared by all cloned human, porcine and Xenopus
mucins (MUC2, MUC3, human tracheobronchial mucin MUC4, MUC5,
porcine submaxillary mucin and Xenopus integumentary mucin).
This common property notwithstanding, several unique
features distinguish the MUC1 proteins from the other
mucins. First, whereas the latter mucins have several
cysteine residues in their extracellular domains that form
disulfide bridges, thereby generating a mucin network, the
MUC1 proteins have no cysteine residues in their
extracellular domain, and thus are less likely to have this
mesh-forming capability. Second, and perhaps most
significantly, the MUC1 protein is a type 1 transmembrane
protein, a molecular structure not shared by the other mucin
molecules, which are secreted from the cell.
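
As a rough worked example of the polymorphism described above, the sketch below multiplies the 20 amino acid repeat length by the stated range of approximately 20 to 100 repeats. The constant and function names are illustrative assumptions and do not come from the specification.

```python
# Illustrative arithmetic only; all names here are hypothetical.
REPEAT_LENGTH_AA = 20            # length of one repeat motif, per the text
REPEAT_COUNT_RANGE = (20, 100)   # approximate number of repeats, per the text


def tandem_array_length(repeat_count: int) -> int:
    """Return the length, in amino acids, of an array with this many repeats."""
    return repeat_count * REPEAT_LENGTH_AA


shortest, longest = (tandem_array_length(n) for n in REPEAT_COUNT_RANGE)
print(shortest, longest)
```

On these figures the tandem repeat array alone would span roughly 400 to 2000 residues, depending on the allele.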

Insights into the function of MUC1 gene products have
been furnished by analyzing the phenotype of tandem repeat
array containing transmembrane MUC1 transfectants. This has
shown that MUC1 expression reduces cellular adhesion
[Ligtenberg, et al., Cancer Res., ibid. (1992)].
Interestingly, a comparison of the human MUC1 amino acid
sequence with the mouse MUC1 homologue [A.P. Spicer, et al.,
"Molecular Cloning of the Mouse Homologue of the Tumor
Associated Mucin, MUC1, Reveals Conservation of Potential
O-Glycosylation Sites, Transmembrane and Cytoplasmic Domains
and a Loss of Minisatellite-Like Polymorphism," J. Biol.
Chem., Vol. 266, pp. 15099-15109 (1991)] shows that whereas
a tandem repeat structure rich in serine and threonine
residues is also observed in the mouse protein, there is
very little conservation of actual amino acid sequence in
this region. This indicates that perhaps the primary
function of mucin tandemly repeated domains is to provide
the "infrastructure" for extensive O-linked glycosylation,
thereby conferring to the molecule its anti-adhesion
function. Recent experiments have indeed shown that the
tandem repeat array mediates this anti-adhesive feature of
MUC1 protein.

As described above, expression of the polymorphic MUC1
proteins reduces cellular aggregation potential, suggesting
that MUC1 interference with cellular interactions may be
critical in tissue morphogenesis such as ductal development
by glandular epithelial cells in normal tissues [J. Hilkens,
et al., ibid. (1992)], and could be responsible for the
detachment of tumor cells from malignant tissues where it is
expressed at high levels [Ligtenberg, et al., Cancer Res.,
ibid. (1992)].

Comparison of MUC1 sequences in different species may
provide additional insights into functionally important
regions of MUC1 gene products. For example, the mouse MUC1
homologue shows, in contrast to the lack of similarity
within the tandem repeating sequence, a very high degree of
amino acid sequence conservation with human MUC1 in the
cytoplasmic and transmembrane domains, as well as in the 120
amino acids N-terminal to the transmembrane domain. This
degree of amino acid sequence similarity is almost 90% in
the cytoplasmic and transmembrane domains, indicating that
these regions, as well as the 120 amino acids N-terminally
adjacent to the transmembrane domain, may be functionally
very important. This contrasts with the lack of
inter-species conservation of the MUC1 tandem repeat array
amino acid sequence, thereby suggesting that distinct
functions may be performed by the tandem repeat array and by
the other highly-conserved regions of the MUC1 proteins.

SUMMARY OF THE INVENTION
According to the present invention, there has now been
identified and characterized a group of novel protein
products of the MUC1 gene.

More particularly, the present invention relates to
novel proteins designated herein as MUC1/X, MUC1/X/alt,
MUC1/Y, MUC1/Y/alt, MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt,
MUC1/Z and MUC1/Z/alt, which function as receptor proteins
and activating ligands for said receptors in human breast
cancer cells, and which proteins are all characterised by
the absence of the characteristic MUC1 protein tandem repeat
array.

Thus, according to the present invention, there is now
provided a biochemically pure MUC1 protein, selected from
the group consisting of MUC1/X, MUC1/X/alt, MUC1/Y,
MUC1/Y/alt, MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z,
and MUC1/Z/alt, or a functional derivative thereof, devoid
of a tandem repeat array.

The term "functional derivative" as used herein is 
intended to include labelled proteins, conjugated proteins, 
fused chimeric proteins and purified receptors in soluble 
form, as well as fragments, deletions, and conservative 
substitutions of said proteins. 

As will be realized, the biochemically pure MUC1
proteins as defined and claimed herein are isolated and
purified and are thus substantially free of natural
contaminants.

The term "conservative substitutions" as used herein is
intended to denote substitutions which preserve the activity
of the defined proteins, involving between 80% and 90%
conservation.
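
One simple way to put a number on such conservation is the percentage of aligned positions that are unchanged. The sketch below assumes that reading (the specification does not state how the 80% to 90% figure is computed), requires two sequences of equal length, and compares the first ten residues of the partial sequence given in this specification with a hypothetical single-residue variant. The function name is illustrative.

```python
def percent_conserved(reference: str, variant: str) -> float:
    """Percentage of aligned positions at which the two sequences agree.

    Assumes the sequences are already aligned and of equal length; this is
    a plain identity count, not a substitution-matrix score.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    unchanged = sum(a == b for a, b in zip(reference, variant))
    return 100.0 * unchanged / len(reference)


reference = "MTPGTQSPFF"  # first ten residues of the partial sequence in the text
variant = "MTPGTQSPFL"    # hypothetical variant with one substitution (F -> L)
print(percent_conserved(reference, variant))
```

A single change in ten positions gives 90% by this measure. Whether a given substitution also preserves activity, as the definition requires, is an experimental question that no identity count can answer.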

More specifically, the present invention provides a
biochemically pure MUC1 protein selected from the group
consisting of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt,
MUC1/W, MUC1/W/alt, MUC1/Z, and MUC1/Z/alt, or a functional
derivative thereof, comprising a partial amino acid
sequence:

MTPGTQSPFFLLLLLTVLT [ATTAPKPAT]
VVTGSGHASSTPGGEKETSATQRSSVP
S S T E K N A

and devoid of a tandem repeat array downstream thereof.

Especially, the present invention provides a
biochemically pure MUC1 protein, selected from the group
consisting of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt,
MUC1/W, MUC1/W/alt, MUC1/Z, and MUC1/Z/alt, or a functional
derivative thereof, having a partial amino acid sequence:

MTPGTQSPFFLLLLLTVLT [ATTAPKPAT]
VVTGSGHASSTPGGEKETSATQRSSVP
S S T E K N A

and devoid of a tandem repeat array downstream thereof.

Furthermore, the present invention provides a
biochemically pure MUC1 protein selected from the group
consisting of MUC1/V, MUC1/V/alt, or a functional derivative
thereof, comprising a partial amino acid sequence:

MTPGTQSPFFLLLLLTVLT [ATTAPKPAT]
VVTGSGHASSTPGGEKETSATQRSSVPS

and devoid of a tandem repeat array downstream thereof.

Still furthermore, the present invention provides a
biochemically pure MUC1 protein selected from the group
consisting of MUC1/V, MUC1/V/alt, or a functional derivative
thereof, having a partial amino acid sequence:

MTPGTQSPFFLLLLLTVLT [ATTAPKPAT]
VVTGSGHASSTPGGEKETSATQRSSVPS

and devoid of a tandem repeat array downstream thereof.

The sequence starts at the amino (NH2) terminal
methionine (M) residue. The 9 amino acid sequence presented
in brackets [ATTAPKPAT] represents an isoform that
is generated by an alternative splice acceptor site.
Hereinafter, MUC1 derivatives containing this additional 9
amino acid sequence will be referred to as the "/alt
configuration" of the novel MUC1 derivatives described
herein. The two arrows indicate the sites at which cleavage
of the signal sequence is expected to occur (Fig. 2).
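
The relationship between a base form and its /alt configuration can be sketched as simple string assembly. The example below uses the partial N-terminal sequence quoted above for the X, Y, W and Z forms; the constant and function names are illustrative assumptions, and the sketch covers only this partial sequence, not the full proteins shown in the figures.

```python
# Partial N-terminal sequence as quoted in the text, split around the
# bracketed insert. Names are hypothetical and used for illustration only.
UPSTREAM = "MTPGTQSPFFLLLLLTVLT"
ALT_INSERT = "ATTAPKPAT"  # 9 residues from the alternative splice acceptor site
DOWNSTREAM = "VVTGSGHASSTPGGEKETSATQRSSVPSSTEKNA"


def n_terminal_partial(alt: bool) -> str:
    """Return the partial sequence, with the 9-residue insert if alt is True."""
    return UPSTREAM + (ALT_INSERT if alt else "") + DOWNSTREAM


base_form = n_terminal_partial(alt=False)
alt_form = n_terminal_partial(alt=True)
print(len(alt_form) - len(base_form))
```

The two strings differ only by the nine bracketed residues, which is the sense in which each /alt configuration is an isoform of its base protein.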

Specifically, the present invention provides
biochemically pure MUC1/X and MUC1/X/alt, respectively
comprising the sequences shown in Figs. 5A and 5B and
functional derivatives thereof; biochemically pure MUC1/Y
and MUC1/Y/alt respectively comprising the sequences shown
in Figs. 6A and 6B and functional derivatives thereof;
biochemically pure MUC1/V, MUC1/V/alt, respectively
comprising the sequences shown in Figs. 6C and 6D and
functional derivatives thereof; MUC1/W and MUC1/W/alt
respectively comprising the sequences shown in Figs. 7A and
7B and functional derivatives thereof; and biochemically
pure MUC1/Z and MUC1/Z/alt respectively comprising the
sequences shown in Figs. 8A and 8B and functional
derivatives thereof.

More particularly, the present invention provides
biochemically pure MUC1/X and MUC1/X/alt, respectively
having the sequences shown in Figs. 5A and 5B and
functional derivatives thereof; biochemically pure MUC1/Y
and MUC1/Y/alt respectively having the sequences shown
in Figs. 6A and 6B and functional derivatives thereof;
biochemically pure MUC1/V, MUC1/V/alt, respectively
comprising the sequences shown in Figs. 6C and 6D and
functional derivatives thereof;
biochemically pure MUC1/W and MUC1/W/alt respectively having
the sequences shown in Figs. 7A and 7B and functional
derivatives thereof; and biochemically pure MUC1/Z and
MUC1/Z/alt respectively having the sequences shown in
Figs. 8A and 8B and functional derivatives thereof.

MUC1/X and MUC1/Y have been found to be generated by a
splicing mechanism, using perfect splice donor and splice
acceptor sites, located upstream and downstream to the
tandem repeat array of MUC1 while maintaining the original
reading frame, and therefore these proteins retain the
cytoplasmic and transmembrane domains, as well as the amino
acids immediately N-terminal to the transmembrane domain
(Figs. 1A and 1B, Fig. 2, Fig. 3 and Fig. 4).

MUC1/V has been found to be generated by a splicing
mechanism, using different splice donor and splice
acceptor sites, located upstream and downstream to the
tandem repeat array of MUC1 while also maintaining the
original reading frame, and therefore this protein retains
the cytoplasmic and transmembrane domains, as well as the
amino acids immediately N-terminal to the transmembrane
domain.

On the other hand, MUC1/W and MUC1/Z are generated by a
splicing mechanism in which the original reading frame is
not maintained and therefore the proteins do not include the
cytoplasmic and transmembrane domains (Figs. 1A and 1B,
Fig. 2, Fig. 3 and Fig. 4) and are therefore secreted from
the cell.
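
The distinction drawn above, between splicing that maintains the reading frame and splicing that does not, comes down to whether the number of nucleotides removed is a multiple of three. The sketch below illustrates this on a hypothetical toy sequence; it is not the MUC1 sequence, and the function names and coordinates are assumptions made purely for illustration.

```python
def codons(seq: str) -> list[str]:
    """Split a nucleotide string into complete triplets, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def splice_out(seq: str, start: int, end: int) -> str:
    """Join the sequence upstream of `start` to the sequence from `end` onward."""
    return seq[:start] + seq[end:]


def frame_preserved(start: int, end: int) -> bool:
    """The downstream frame is kept only if the removed length is a multiple of 3."""
    return (end - start) % 3 == 0


# Hypothetical toy reading frame: ATG GCC AAA GGG TTT CCC TAA
toy = "ATGGCCAAAGGGTTTCCCTAA"

in_frame = splice_out(toy, 6, 12)      # removes 6 nucleotides (AAA GGG)
out_of_frame = splice_out(toy, 6, 10)  # removes 4 nucleotides (AAAG)

print(frame_preserved(6, 12), codons(in_frame))
print(frame_preserved(6, 10), codons(out_of_frame))
```

In the first case the codons downstream of the junction are read exactly as before, which corresponds to the X, Y and V forms retaining their C-terminal domains. In the second case every downstream codon changes, so the original C-terminal sequence is no longer encoded, which is consistent with the text's statement that the W and Z forms lack the transmembrane and cytoplasmic domains.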

Further extensive research, testing and analysis
indicate that MUC1/X, MUC1/Y, MUC1/V and their /alt
configurations serve as receptor proteins in breast cancer
cells, while MUC1/W and MUC1/Z and their /alt configurations
function as ligands for said receptors.
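
For reference, the ten designations and the roles assigned to them above can be collected in a small lookup table. The sketch below is only a convenience for the reader; the dictionary layout and function name are assumptions, and the role given for MUC1/W reflects what the specification describes as a belief based on structure rather than a demonstrated finding.

```python
# Roles as stated in the text for each base form; the /alt configuration of
# each form is assigned the same role. Names are hypothetical.
BASE_ROLES = {
    "X": "receptor",
    "Y": "receptor",
    "V": "receptor",
    "W": "ligand",
    "Z": "ligand",
}


def isoform_roles() -> dict[str, str]:
    """Map every designation (base and /alt) to its stated role."""
    roles = {}
    for letter, role in BASE_ROLES.items():
        for suffix in ("", "/alt"):
            roles[f"MUC1/{letter}{suffix}"] = role
    return roles


roles = isoform_roles()
receptors = sorted(name for name, role in roles.items() if role == "receptor")
ligands = sorted(name for name, role in roles.items() if role == "ligand")
print(receptors)
print(ligands)
```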

In contrast to the new MUC1/X, MUC1/X/alt, MUC1/Y,
MUC1/Y/alt, MUC1/V, and MUC1/V/alt proteins that are
continuous from their N-terminal extracellular domains
through to their C-terminal cytoplasmic domains (Fig. 3,
Fig. 12 and Fig. 13), the tandem repeat array containing
MUC1 protein is proteolytically cleaved in its extracellular
domain [Ligtenberg, et al., "Cell Associated Episialin Is a
Complex Containing Two Proteins Derived From a Common
Precursor," J. Biol. Chem., Vol. 267, pp. 6171-6177 (1992)].
Integrity of the MUC1 extracellular domain as in the MUC1/X,
MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V and MUC1/V/alt
proteins is likely to be essential for ligand binding.

Furthermore, the MUC1 amino acid sequence reveals
striking similarities to sequences in the extracellular
domain of cytokine receptors that are known to participate
in ligand binding. Significantly, this homology maps in
close proximity to the region where proteolytic cleavage
occurs in the tandem repeat array containing MUC1 protein,
suggesting that integrity of this site in the MUC1/X,
MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V and MUC1/V/alt
proteins is of prime importance for both ligand binding and
signal transmission. This demonstrates that the MUC1/X,
MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V and MUC1/V/alt
proteins are cytokine-like receptor molecules.

Furthermore, experiments carried out with the MUC1
proteins described previously in the literature, which
are characterized by the presence of the tandem repeat
array, showed that these proteins do not transform cells
into cancerous cells. Specifically, when expression
vectors containing cDNA coding for the tandem repeat array
MUC1 protein were transfected into eucaryotic cells, the
said transfectants did not become tumorigenic. In
contradistinction thereto, transfection of expression
vectors containing cDNA coding for the MUC1/Y protein of the
present invention into cells caused the said cells to
become tumorigenic, as described hereinbelow.

As is known, the biological effects of many factors
controlling cell proliferation, differentiation and
metabolism are mediated by membrane-located proteins
(receptors) that participate in signal transduction
processes. Invariably, growth factor binding to specific
cell surface receptors initiates a signalling cascade that
is transduced in many cases via phosphorylation of tyrosine
residues within the receptor protein [M.J. Pazin and L.T.
Williams, "Triggering Signalling Cascades by Receptor
Tyrosine Kinases," TIBS, Vol. 17, pp. 374-378 (1992)].
Assembly of receptor signalling complexes formed between the
receptor protein and SRC homology 2 (SH2) domain containing
proteins that interact with phosphorylated tyrosine residues
present in the receptor cytoplasmic domain mediates the
signal transduction process. This triggering ultimately
results in the activation of specific gene expression
involving transcription of both immediate and delayed
response genes.

A number of cell surface receptor proteins are likely
involved in both the origin and progression of human breast
cancer - a prime example is the neu (erbB-2) membrane
located receptor molecule [D.J. Slamon, et al., "Studies on
the HER-2/neu Protooncogene in Human Breast and Ovarian
Cancer," Science, Vol. 244, pp. 707-712 (1989)]. It is
unfortunate to note, however, that only
exceptionally few genes that code for signal transducing
molecules in general, and membrane-located receptor proteins
in particular, have to date been implicated in the
development of human breast cancer.

Thus, as stated above, there have now been identified
and characterized novel protein products of the MUC1 gene,
designated herein as MUC1/X, MUC1/Y and MUC1/V, that reside
in the cell membrane and function as receptor proteins, and
are highly expressed in human breast cancer tissue. There
have also now been identified and characterized novel
protein products of the MUC1 gene, designated herein as
MUC1/W and MUC1/Z, the latter of which has been found to
function as a ligand, and the former of which is believed to
have a similar function, based on its structure.

These proteins and the /alt configurations thereof, as 
well as functional derivatives thereof, form the basis of 
the present invention. 
Thus, the present invention further provides a
pharmaceutical composition comprising as an active
ingredient therein a biochemically purified MUC1 protein
selected from the group consisting of MUC1/X, MUC1/X/alt,
MUC1/Y, MUC1/Y/alt, MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt,
MUC1/Z, MUC1/Z/alt and functional derivatives thereof,
devoid of a tandem repeat array.

More specifically, the present invention provides,
inter alia, a pharmaceutical composition for the treatment
of human breast cancer, comprising as an active ingredient
therein a biochemically pure MUC1 protein selected from the
group consisting of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt,
MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z, MUC1/Z/alt
and functional derivatives thereof, in soluble form and in
combination with a pharmaceutically acceptable carrier.

The invention also provides a conjugated toxin for the
treatment of human breast cancer, comprising a MUC1 protein
selected from the group consisting of MUC1/W, MUC1/W/alt,
MUC1/Z, MUC1/Z/alt, and functional derivatives thereof,
attached to a cytotoxic agent.

In another aspect of the present invention, there is
provided a diagnostic agent for the detection of human
breast cancer cells, comprising a detectably labelled MUC1
protein selected from the group consisting of MUC1/W,
MUC1/W/alt, MUC1/Z, MUC1/Z/alt, and functional derivatives
thereof.

The invention also provides a diagnostic agent for
identification of sites in the body to which breast cancer
cells have spread, comprising a detectably labelled MUC1
protein selected from the group consisting of MUC1/W,
MUC1/W/alt, MUC1/Z, MUC1/Z/alt, and functional derivatives
thereof.

As will be realized from the above, the invention also
includes a method for the treatment of human breast cancer,
comprising administering to an individual having human
breast cancer cells an amount of soluble MUC1/X, MUC1/X/alt,
MUC1/Y, MUC1/Y/alt, MUC1/V, or MUC1/V/alt receptors
sufficient to inhibit the binding of MUC1 ligands to said
cells.

In yet another aspect of the present invention, there
is provided a method for the treatment of human breast
cancer, comprising administering to an individual having
human breast cancer cells an amount of a ligand-toxin
conjugate comprising a ligand selected from MUC1/W,
MUC1/W/alt, MUC1/Z or MUC1/Z/alt, fused to a cytotoxic
toxin.

The MUC1/Z and MUC1/W proteins may be used:
a) for breast cancer diagnosis and prognosis, both in
vivo and in vitro;

b) for imaging cancer tissue; and

c) for therapy of breast cancer patients.

Breast Cancer Diagnosis and Prognosis

As the MUC1/W and MUC1/Z proteins are synthesized by
breast cancer tissue and are secreted from the cell, their
serum levels can serve as markers for the disease. Assays
employing antibodies directed against the MUC1/W and MUC1/Z
proteins are used to analyse the serum levels of these
proteins. This provides a means for diagnosing individuals
with early breast cancer, and/or for monitoring the
progression of breast cancer in patients who already have
been diagnosed.

In general, ELISAs are the preferred immunoassays
employed to assess the amount of the new proteins described
and claimed herein present in a specimen. ELISA assays are
well-known to those skilled in the art. Both polyclonal and
monoclonal antibodies can be used in the assays. Where
appropriate, other immunoassays, such as radioimmunoassays
(RIA), can be used, as known to those skilled in the art.
Available immunoassays are extensively described in the
patent and scientific literature. See, for example, U.S.
Patents 3,791,932; 3,839,153; 3,850,752; 3,850,578;
3,853,987; 3,867,517; 3,879,262; 3,901,654; 3,935,074;
3,984,533; 3,996,345; 4,034,074 and 4,098,876, as well as
Sambrook, et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory, New York, U.S.A. (1989), and
E. Harlow and D. Lane, Antibodies: A Laboratory Manual,
Cold Spring Harbor Laboratory, New York, U.S.A. (1988).



Imaging of Breast Cancer Tissue 

The identification of sites in the body to which breast
cancer cells have spread is of prime importance for the
successful eradication of the disease. The MUC1/Z ligand
specifically homes in on breast cancer cells expressing
the target MUC1/X, MUC1/Y and MUC1/V receptor molecules,
providing the means for efficiently localizing cancerous
tissue. Imaging is performed by tagging the MUC1/Z ligand
with, for example, radioactivity, injecting the labelled
MUC1/Z protein into the patient, and monitoring its
localisation within the body.
Therapy of Breast Cancer Patients with Ligand

1. Ligand as a Drug-Delivery System

Using the MUC1/Z ligand as a drug delivery system,
ligand-toxin conjugates are prepared, such as MUC1/Z fused
to a cytotoxic toxin.

The toxin thus specifically homes in on the target
breast cancer cell, which is then killed. Alternatively,
the ligand is labelled with cytotoxic levels of
radioactivity. The target breast cancer cells are then
directly eradicated by the radioactively-labelled ligand.

2. Blockade of MUC1/X, MUC1/Y and MUC1/V Receptors without
Receptor Activation

By using defined regions of the ligand that only bind
to the receptor, yet do not activate it, it is possible to
effectively "swamp" the receptors present on the breast
cancer cell with non-activating ligand. Receptor occupancy
with non-activating ligands (antagonistic ligands) will
preclude the binding of activating ligands, thereby limiting
the growth of the breast cancer cell.

The specification and claims provide guidance for the
use of the invention in humans. The Investigator's Handbook
provided by the Cancer Therapy Evaluation Program, Division
of Cancer Treatment, National Cancer Institute, U.S.A.,
indicates that the starting dose for Phase I trials is based
on animal data such as rodent equivalent LD10. Further, the
manual (page 22) indicates that animal studies carried out
prior to Phase I trials provide the investigator with a
prediction of the likely effects. [See also J.S. Driscoll,
"The Preclinical New Drug Research Program of the National
Cancer Institute," Cancer Treatment Reports, Vol. 68,
pp. 63-75 (1984).] Therefore, the data accumulated in a
mouse model is not only acceptable in determining human
doses and protocols, but is considered highly predictive.

The new MUC1 proteins of the present invention, i.e.,
the proteins selected from the group of proteins consisting
of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V,
MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z and MUC1/Z/alt, as
well as their functional derivatives as defined herein, are
prepared by recombinant DNA technology and polypeptide
synthesis.

Thus, the new MUC1 proteins of the present invention
are prepared by culturing a host cell transformed with an
expression vector comprising DNA encoding an amino acid
sequence of the new MUC1 proteins in a nutrient medium, and
recovering the new MUC1 proteins from the cultured broth.

Particulars of the above-mentioned process are 
explained in detail below. 

The host cell may include a microorganism [bacteria
(e.g., Escherichia coli, Bacillus subtilis, etc.); yeast
(e.g., Saccharomyces cerevisiae, etc.)], cultured human or
animal cells (e.g., CHO cell, L929 cell, etc.), cultured
plant cells, and cultured insect cells. Preferred examples
of the microorganism include bacteria, especially a strain
belonging to the genus Escherichia (e.g., E. coli HB-101,
ATCC 33694; E. coli HB-101-16, FERM BP-1872; E. coli 294,
ATCC 31446; E. coli X-1776, ATCC 31537, etc.); yeast; animal
cell lines (e.g., mouse L929 cell, Chinese hamster ovary
(CHO) cell, etc.), and the like.

When a bacterium, especially E. coli, is used as a
host cell, the expression vector usually comprises at least
a promoter-operator region, initiation codon, DNA encoding
the amino acid sequence of the new MUC1 proteins,
termination codon, terminator region, and replicatable unit.
When yeast or an animal cell is used as host cell, the
expression vector is preferably composed of at least
promoter, initiation codon, DNA encoding the amino acid
sequence of the signal peptide and the new MUC1 proteins,
and termination codon, and it is possible that enhancer
sequences, 5'- and 3'-noncoding region of the native MUC1
proteins, splicing junctions, polyadenylation site and
replicatable unit are also inserted into the expression
vector.

The promoter-operator region comprises promoter,
operator and Shine-Dalgarno (SD) sequence (e.g., AAGG,
etc.). Examples of the promoter-operator region include
conventionally employed promoter-operator regions (e.g.,
lactose-operon, PL-promoter, trp-promoter, etc.), and the
promoter for the expression of the new MUC1 protein in
mammalian cells may include HTLV-promoter, SV40 early- or
late-promoter, LTR-promoter, mouse metallothionein I
(MMT)-promoter and vaccinia-promoter.

Preferred initiation codon includes methionine codon
(ATG).

The DNA encoding signal peptide includes the DNA
encoding signal peptide of the new MUC1 proteins.

The DNA encoding the amino acid sequence of the signal
peptide or the new MUC1 proteins is prepared in a
conventional manner, such as a partial or whole DNA
synthesis using DNA synthesizer and/or treatment of the
complete DNA sequence coding for native or mutant MUC1
proteins inserted in a suitable vector obtainable from a
transformant or genome in a conventional manner (e.g.,
digestion with restriction enzyme, dephosphorylation with
bacterial alkaline phosphatase, ligation using T4 DNA
ligase).

The termination codon(s) include conventionally
employed termination codon (e.g., TAG, TGA, etc.).

The terminator region contains natural or synthetic
terminator (e.g., synthetic fd phage terminator, etc.).

The replicatable unit is a DNA sequence capable of
replicating the whole DNA sequence belonging thereto in the
host cells and includes natural plasmid, artificially
modified plasmid (e.g., DNA fragment prepared from natural
plasmid) and synthetic plasmid, and preferred examples of
the plasmid include plasmid pBR 322 or artificially modified
plasmid thereof (DNA fragment obtained from a suitable
restriction enzyme treatment of pBR 322) for E. coli;
plasmid pRSVneo ATCC 37198, plasmid pSV2dhfr ATCC 37145,
plasmid pdBPV-MMTneo ATCC 37224, plasmid pSV2neo ATCC 37149
for mammalian cell.

The enhancer sequence includes the enhancer sequence
(72 bp) of SV40.

The polyadenylation site includes the polyadenylation
site of SV40.

The splicing junction includes the splicing junction of
SV40.

The promoter-operator region, initiation codon, DNA
encoding the amino acid sequence of the new MUC1 proteins,
termination codon(s) and terminator region are consecutively
and circularly linked together with an adequate replicatable
unit (plasmid), if desired, using adequate DNA fragment(s)
(e.g., linker, other restriction site, etc.) in a
conventional manner (e.g., digestion with restriction
enzyme, phosphorylation using T4 polynucleotide kinase,
ligation using T4 DNA ligase) to give an expression vector.
When a mammalian cell line is used as a host cell, it is
possible that enhancer sequence, promoter, 5'-noncoding
region of the cDNA of the native MUC1 proteins, initiation
codon, DNA encoding amino acid sequences of the signal
peptide and the new MUC1 proteins, termination codon(s),
3'-noncoding region, splicing junctions and polyadenylation
site are consecutively and circularly linked together with an
adequate replicatable unit in the above manner.
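
The consecutive arrangement described for a bacterial host can be pictured as an ordered list of elements, with optional fragments such as linkers permitted in between. The sketch below checks a proposed layout against that order. The element labels paraphrase the text, the list order is an assumption taken from the order in which the text enumerates the elements, and the replicatable unit is left out of the order check because the construct is circular. None of the names refer to real software.

```python
# Required elements for an E. coli expression vector, in the order the text
# lists them. Labels are hypothetical paraphrases used for illustration.
E_COLI_REQUIRED = [
    "promoter-operator region",
    "initiation codon",
    "coding DNA",
    "termination codon",
    "terminator region",
]


def has_required_order(elements, required=E_COLI_REQUIRED) -> bool:
    """Return True if every required element appears, in the required order.

    Extra elements (for example linkers) may sit between required ones.
    """
    remaining = iter(elements)
    # `name in remaining` advances the iterator until it finds `name`,
    # so each later element must occur after the one before it.
    return all(name in remaining for name in required)


layout = [
    "promoter-operator region",
    "initiation codon",
    "coding DNA",
    "termination codon",
    "linker",
    "terminator region",
    "replicatable unit",
]
print(has_required_order(layout))
```

A layout with the terminator region placed ahead of the coding DNA, or with the initiation codon missing, fails the same check.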

The expression vector is inserted into a host cell by
methods known per se. The insertion is carried out in a
conventional manner (e.g., transformation including
transfection, microinjection, etc.) to give a transformant
including transfectant.

For the production of the new MUC1 proteins in the
process of the present invention, the thus obtained transformant
comprising the expression vector is cultured in a nutrient
medium.

The nutrient medium contains carbon source(s) (e.g.,
glucose, glycerine, mannitol, fructose, lactose, etc.) and
inorganic or organic nitrogen source(s) (e.g., ammonium
sulfate, ammonium chloride, hydrolysate of casein, yeast
extract, polypeptone, bactotryptone, beef extracts, etc.). If
desired, other nutritious sources [e.g., inorganic salts
(e.g., sodium or potassium biphosphate, dipotassium hydrogen
phosphate, magnesium chloride, magnesium sulfate, calcium
chloride), vitamins (e.g., vitamin B1), antibiotics (e.g.,
ampicillin), etc.] are added to the medium. For the culture
of mammalian cell, Dulbecco's Modified Eagle's Minimum
Essential Medium (DMEM) supplemented with fetal calf serum
and an antibiotic is often used.

WO 96/03502    PCT/IB95/00627

The culture of the transformant is generally carried out
at pH 5.5-8.5 (preferably pH 7-7.5) and 18-40°C (preferably
25-38°C) for 5-50 hours.
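
The stated culture window lends itself to a simple range check. The sketch below encodes the general and preferred ranges given above and classifies a proposed set of conditions. The class, dictionary and function names are illustrative assumptions; no preferred duration is stated in the text, so only pH and temperature have a preferred range here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval [low, high]."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# General ranges as stated in the text.
ALLOWED = {"ph": Range(5.5, 8.5), "temp_c": Range(18, 40), "hours": Range(5, 50)}
# Preferred ranges as stated in the text (none is given for duration).
PREFERRED = {"ph": Range(7.0, 7.5), "temp_c": Range(25, 38)}


def classify(ph: float, temp_c: float, hours: float) -> str:
    """Label conditions as preferred, within the stated range, or outside it."""
    values = {"ph": ph, "temp_c": temp_c, "hours": hours}
    if not all(ALLOWED[key].contains(value) for key, value in values.items()):
        return "outside stated range"
    if all(PREFERRED[key].contains(values[key]) for key in PREFERRED):
        return "preferred"
    return "within stated range"


print(classify(ph=7.2, temp_c=37, hours=24))
print(classify(ph=6.0, temp_c=30, hours=24))
print(classify(ph=9.0, temp_c=37, hours=24))
```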

When a bacterium such as E. coli is used as a host
cell, the thus produced new MUC1 proteins generally exist in
cells of the cultured transformant. The cells are
collected by filtration or centrifugation, and the cell wall
and/or cell membrane thereof are destroyed in a conventional
manner (e.g., treatment with ultrasonic waves and/or
lysozyme, etc.) to give debris. From the debris, the new
MUC1 proteins are purified and isolated in a conventional
manner, as generally employed for the purification and
isolation of natural or synthetic proteins [e.g.,
dissolution of protein with an appropriate solvent (e.g., 8M
aqueous urea, 6M aqueous guanidinium salts, etc.), dialysis,
gel filtration, column chromatography, high performance
liquid chromatography, etc.]. When a mammalian cell is used
as a host cell, the produced new MUC1 proteins generally
exist in the culture solution. The culture filtrate
(supernatant) is obtained by filtration or centrifugation of
the cultured broth. From the culture filtrate, the new MUC1
proteins are purified in a conventional manner.
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As will be realized, having now identified the new MUCl 
proteins of the present invention, purified antibodies, both 
polyclonal and monoclonal, which specifically bind 
respectively to each of said proteins can be readily 
prepared by methods per se known in the art- Once said 
antibodies are prepared, they can be conjugated to a 
therapeutic drug or a detectable moiety and/ar bound to a 
solid support. 

The preparation of said antibodies also enables the
carrying-out of a bioassay for determining the amount, in a
biological sample, of a MUC1 protein selected from the group
consisting of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt,
MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z and
MUC1/Z/alt or a functional derivative thereof devoid of a
tandem repeat array, comprising (a) contacting the
biological sample with an antibody under conditions such
that a specific complex of the antibody and said MUC1
protein can be formed; and (b) determining the amount of
the antibody/MUC1 protein complex, the amount of the
complex indicating the amount of said MUC1 protein in the
biological sample. It likewise allows a method of detecting
the presence of a cancer in a subject, comprising
determining the presence of a detectable amount of said
MUC1 protein in a biopsy from the subject, the presence of
a detectable amount of said MUC1 protein relative to the
absence of MUC1 protein in a normal control indicating the
presence of a cancer. It further allows a method of
determining the prognosis of a subject having cancer,
comprising determining the presence of a detectable amount
of said MUC1 protein in a biopsy from the subject, the
presence of a detectable amount of MUC1 protein relative to
the absence of said MUC1 protein in a normal control
indicating a decreased chance of long-term survival.

While the invention will now be described in connection
with certain preferred embodiments in the following examples
and with reference to the following illustrative figures so
that aspects thereof may be more fully understood and
appreciated, it is not intended to limit the invention to
these particular embodiments. On the contrary, it is
intended to cover all alternatives, modifications and
equivalents as may be included within the scope of the
invention as defined by the appended claims. Thus, the
following examples which include preferred embodiments of
the novel proteins, the functional derivatives thereof, the
combination thereof with cytotoxic agents and detectably
labelled markers, as well as the preparation of DNA
constructs, vectors, and transfected hosts encoding and
incorporating the same, and the various uses thereof, will
serve to illustrate the practice of this invention, it being
understood that the particulars shown are by way of example
and for purposes of illustrative discussion of preferred
embodiments of the present invention only and are presented
in the cause of providing what is believed to be the most
useful and readily understood description of formulation
procedures as well as of the principles and conceptual
aspects of the invention.



BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings: 

Fig. 1A is a scheme of alternative splice events (W, X, Y
and Z) that delete the MUC1 tandem repeat array and
flanking sequences;

Fig. 1B is a scheme of alternative splice events (W, X, Y
and Z) and nucleotide sequence of the regions 5'
flanking the AG consensus splice acceptor site;

Fig. 2 shows amino terminal amino acid sequences of the MUC1
proteins, demonstrating the two variant MUC1 signal
peptide forms and sites of signal peptide cleavage;

Fig. 3 is a scheme of the repeat array containing MUC1
protein (upper molecule) and the novel MUC1/W, MUC1/X,
MUC1/Y and MUC1/Z proteins generated by alternative
splicing;

Fig. 4 is a scheme of the repeat array containing MUC1/alt
protein that has the variant signal peptide at its
N-terminal and the novel MUC1/Y/alt, MUC1/X/alt,
MUC1/W/alt and MUC1/Z/alt proteins generated by
alternative splicing;

Fig. 5A shows the amino acid sequence of the MUC1/X
protein;

Fig. 5B shows the amino acid sequence of the MUC1/X/alt
protein;

Fig. 6A shows the amino acid sequence of the MUC1/Y protein;

Fig. 6B shows the amino acid sequence of the MUC1/Y/alt
protein;

Fig. 6C shows the amino acid sequence of the MUC1/V protein;

Fig. 6D shows the amino acid sequence of the MUC1/V/alt
protein;

Fig. 7A shows the amino acid sequence of the MUC1/W protein;

Fig. 7B shows the amino acid sequence of the MUC1/W/alt
protein;

Fig. 8A shows the amino acid sequence of the MUC1/Z protein;

Fig. 8B shows the amino acid sequence of the MUC1/Z/alt
protein;

Fig. 9 illustrates the overexpression of the novel MUC1/X,
MUC1/Y, and MUC1/V proteins in human breast cancer
tissue and post-translational modification by
phosphorylation;

Fig. 10 illustrates phosphorylation on tyrosine residues of
the MUC1/Y protein;

Fig. 11 depicts the binding of tyrosine phosphorylated MUC1
cytoplasmic domain to SH2 domains;

Fig. 12 is a scheme depicting the repeat array containing
MUC1 protein (upper drawing) and the novel MUC1/Y
protein (lower drawing);

Fig. 13 is a scheme depicting the location of tyrosine and
cysteine residues in the MUC1 proteins; and

Fig. 14 is a comparison scheme of MUC1 sequences and
sequences known to interact with SH2 domains.



DETAILED DESCRIPTION OF THE INVENTION

With regard to the attached drawings, the following is
a more detailed description thereof, so that the same can be
more readily understood:

Fig. 1A: Scheme of alternative splice events (W, X, Y
and Z) that delete the MUC1 tandem repeat array and flanking
sequences. The MUC1 genomic sequence is indicated by the
continuous line. The various splice events (W, X, Y and Z)
that delete the tandem repeat array are indicated. The
dinucleotides at the splice donor and splice acceptor sites
are indicated by GT and AG, respectively. The X and Y
splices retain the same reading frame (RF) as the MUC1
protein, whereas W and Z change the reading frame. The
signal peptide and the transmembrane domains are indicated
by SIG and TM, respectively.
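
The reading-frame statement above follows from simple arithmetic: a deletion preserves the downstream reading frame only when the number of coding nucleotides removed is a multiple of three. The Python sketch below illustrates this; the function names and the example lengths are hypothetical and are not taken from the MUC1 sequence.

```python
def frame_shift(deleted_nt: int) -> int:
    """Return the frame offset (0, 1 or 2) introduced by a deletion."""
    if deleted_nt < 0:
        raise ValueError("deleted length must be non-negative")
    return deleted_nt % 3


def retains_reading_frame(deleted_nt: int) -> bool:
    """A deletion keeps the downstream reading frame only if it removes
    a whole number of codons (a multiple of three nucleotides)."""
    return frame_shift(deleted_nt) == 0


# Hypothetical deletion lengths, for illustration only.
for label, length in [("in-frame example", 300), ("frameshift example", 301)]:
    print(label, length, retains_reading_frame(length), frame_shift(length))
```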

Fig. 1B: Scheme of alternative splice events (W, X, Y
and Z) and 5' sequences flanking the splice acceptor site.
The pyrimidine-rich sequences 5' flanking the W, X, Y and Z
splice acceptor sites are shown. Other symbols are as in
Fig. 1A.

Fig. 2: Alternative MUC1 N-terminal signal peptide
sequences. The amino terminal (N-terminal) amino acid
sequence is presented using the one letter code. The lower
sequence represents the N-terminal sequence that includes an
extra 3 amino acids (boxed sequence) that is generated by an
alternative splice event. Numbers appearing above the amino
acid sequence represent the probability (calculated
according to the Von Heijne signal peptide cleavage rules;
arbitrary units are used) of signal peptide cleavage
occurring at that site. The upward-facing arrow represents
the most likely site of signal peptide cleavage.

Fig. 3: Scheme of the repeat array containing MUC1
protein (upper molecule) and the novel MUC1/Y, MUC1/X,
MUC1/W and MUC1/Z proteins. The novel MUC1/Y, MUC1/X,
MUC1/W and MUC1/Z proteins are generated by alternative
splicing events that delete the central tandem repeat array
(compare upper and lower molecules). All MUC1 forms contain
a hydrophobic N-terminal signal sequence (slashed box at
left of figure) that is co-translationally cleaved (arrow at
left of figure). This is followed by the tandem repeat
array (upper molecule) that is illustrated by the block of
closely-spaced vertical lines. The highly hydrophobic 29
amino acid stretch constituting the transmembrane domain
(TM) is shown at the C-terminal end of both MUC1 proteins,
followed by the cytoplasmic domain (CYT). The region
comprising the proteolytic cleavage site [Ligtenberg, et
al., J. Biol. Chem., ibid. (1992)] of the repeat array
containing MUC1 protein (upper molecule) is indicated by the
two vertical dotted lines just N-terminal to the
transmembrane domain. Potential N-linked glycosylation sites
are shown with an asterisk (*). The W and Z splice events
alter the reading frame of the MUC1 protein downstream to
their respective splice acceptor sites, and the MUC1/W and
MUC1/Z proteins therefore contain downstream amino acid
sequences that differ from the MUC1/Y and MUC1/X proteins.

Fig. 4: Scheme of the repeat array containing MUC1/alt
protein that has the variant signal peptide at its
N-terminal and the novel MUC1/Y/alt, MUC1/X/alt, MUC1/W/alt
and MUC1/Z/alt proteins generated by alternative splicing.
The altered N-terminal (see Fig. 2) resulting from the
altered signal peptide is illustrated immediately distal to
the slashed box at the N-terminus. All the resulting novel
MUC1/Y/alt, MUC1/X/alt, MUC1/W/alt and MUC1/Z/alt proteins
will accordingly have the variant N-terminus. Other symbols
are as in Fig. 3.

Fig. 5A: Amino acid sequence of the MUC1/X protein.
The amino acid sequence (one letter code) is shown beneath
the nucleotide sequence and begins with the initiating
methionine residue. Numbering of the amino acid residues is
shown to the right of the figure.

Fig. 5B: Amino acid sequence of the MUC1/X/alt
protein. The amino acid sequence (one letter code) is shown
beneath the nucleotide sequence and begins with the
initiating methionine residue. Numbering of the amino acid
residues is shown to the right of the figure.

Fig. 6A: Amino acid sequence of the MUC1/Y protein.
The amino acid sequence (one letter code) is shown beneath
the nucleotide sequence and begins with the initiating
methionine residue. Numbering of the amino acid residues is
shown to the right of the figure.

Fig. 6B: Amino acid sequence of the MUC1/Y/alt
protein. The amino acid sequence (one letter code) is shown
beneath the nucleotide sequence and begins with the
initiating methionine residue. Numbering of the amino acid
residues is shown to the right of the figure.

Fig. 6C: Amino acid sequence of the MUC1/V protein.
The amino acid sequence (one letter code) is shown beneath
the nucleotide sequence and begins with the initiating
methionine residue. Numbering of the amino acid residues is
shown to the right of the figure.

Fig. 6D: Amino acid sequence of the MUC1/V/alt
protein. The amino acid sequence (one letter code) is shown
beneath the nucleotide sequence and begins with the
initiating methionine residue. Numbering of the amino acid
residues is shown to the right of the figure.

Fig. 7A: Amino acid sequence of the MUC1/W protein.
The amino acid sequence (one letter code) is shown beneath
the nucleotide sequence and begins with the initiating
methionine residue. Numbering of the amino acid residues is
shown to the right of the figure.

Fig. 7B: Amino acid sequence of the MUC1/W/alt
protein. The amino acid sequence (one letter code) is shown
beneath the nucleotide sequence and begins with the
initiating methionine residue. Numbering of the amino acid
residues is shown to the right of the figure.

Fig. 8A: Amino acid sequence of the MUC1/Z protein.
The amino acid sequence (one letter code) is shown beneath
the nucleotide sequence and begins with the initiating
methionine residue. Numbering of the amino acid residues is
shown to the right of the figure.

Fig. 8B: Amino acid sequence of the MUC1/Z/alt
protein. The amino acid sequence (one letter code) is shown
beneath the nucleotide sequence and begins with the
initiating methionine residue. Numbering of the amino acid
residues is shown to the right of the figure.

Fig. 9: Overexpression of the novel MUC1/X, MUC1/Y and
MUC1/V proteins in human breast cancer tissue and
post-translational modification by phosphorylation.

(A) Cell lysates prepared from breast cancer cells
(lane 2), primary human breast cancer tissues from 3
different patients (lanes 1, 4 and 5) and the adjacent
normal breast tissues (lanes 3 and 6), were analyzed by
SDS-polyacrylamide gel electrophoresis (SDS-PAGE),
transferred to nitrocellulose and immunoblotted with a
rabbit polyclonal antibody directed against the MUC1
cytoplasmic domain. The regions of specific
immunoreactivity are indicated by the 3 open arrows to the
left of the figure.

(B) The novel MUC1/Y protein may be post-translationally
modified by phosphorylation. Radioactive inorganic
phosphate was added to stable Ras transformed 3T3 cell
transfectants expressing the MUC1/Y protein and, following
a 5-hour incubation, the cells were lysed. Cell lysates
subjected to immunoprecipitation with either pre-immune
serum or with immune serum generated against the 62
C-terminal amino acids of the MUC1 cytoplasmic domain
(lanes 1 and 2, respectively) were analyzed by SDS-PAGE,
followed by autoradiography. The phosphorylated MUC1/Y
protein is clearly visible in lane 2 (arrow to the right of
the figure). Molecular size standards are indicated at left
of figures in kilodaltons.

Fig. 10: Phosphorylation on tyrosine residues of the
MUC1/Y protein. The immunoprecipitated phosphorylated MUC1
proteins [from lane 2 in Fig. 9(B)] were isolated from SDS-
acrylamide (10%) gel and hydrolyzed in 6M HCl at 110°C for
1 hour. Labelled phosphoamino acids (with added unlabelled
internal phosphoamino acid markers) were analyzed by
thin-layer high voltage electrophoresis, followed by
PhosphorImager analysis. The positions of migration of
phosphoserine, phosphothreonine and phosphotyrosine are
indicated by PS, PT and PY respectively, and inorganic
phosphate is shown by Pi.

Fig. 11: Binding of tyrosine phosphorylated MUC1
cytoplasmic domain to SH2 domains.

(A) The complete 72 amino acid sequence of the human
MUC1 cytoplasmic domain is shown, using the one letter amino
acid code. Indicated below this are changes in the mouse
MUC1 homologue. The 7 tyrosine residues in the cytoplasmic
domain are highlighted with an asterisk, and likely sites of
interaction between phosphotyrosine-containing peptide
sequences (boxed regions within the cytoplasmic domain amino
acid sequence) and SH2 domain containing proteins (boxed at
the bottom of the figure) are shown. The cysteine-containing
sequence is circled at the N-terminal of the
cytoplasmic domain.

(B) Recombinant MUC1 cytoplasmic domain was
synthesized as a fusion protein with N-terminal DHFR protein
(from Halobacterium) using the pET system. The gel purified
recombinant protein was in-vitro tyrosine phosphorylated by
incubation with gamma-32P-ATP and highly purified EGF
receptor (EGF-R) protein isolated from A431 cells. The
radioactively-labelled MUC1 cytoplasmic domain was
repurified from a SDS-acrylamide (10%) gel and incubated
overnight at 4°C, with either GST (glutathione transferase)
beads alone (lane 1), or with GST/GRB-2 fusion protein beads
(GRB-2, lane 2). The beads were then extensively washed and
labelled bound proteins analyzed by SDS-PAGE. Specific
GRB-2 binding of labelled MUC1 cytoplasmic domain is
indicated by the arrow to the right of the figure.

(C) Labelled tyrosine phosphorylated MUC1 cytoplasmic
domain, purified by SDS-acrylamide (10%) gel, was incubated
with agarose beads bound to src SH2 domain (src, lane 1),
the C-terminal p85 phosphatidylinositol (PI) 3' kinase SH2
domain (PI, lane 2), and the N-terminal phospholipase C
gamma 1 SH2 domain (lip. C, lane 3) and analyzed as
described above. Specific binding to the src and
phospholipase C SH2 domains (lanes 1 and 3, respectively) is
indicated by the arrow to the right of the figure. No
binding was observed to the C-terminal p85 (PI) 3' kinase SH2
domain (lane 2).

Fig. 12: Scheme showing the repeat array containing
MUC1 protein (upper drawing) and the novel MUC1/Y protein
(lower drawing). The novel MUC1/Y form is generated by an
alternative splicing event that deletes the central tandem
repeat array (compare upper and lower molecules). Both MUC1
forms contain a hydrophobic N-terminal signal sequence
(slashed box at left of figure) that is co-translationally
cleaved (arrow at left of figure). This is followed by the
tandem repeat array (upper molecule) that is illustrated by
the block of closely-spaced vertical lines. The highly
hydrophobic 28 amino acid stretch constituting the
transmembrane domain (TM) is shown at the C-terminal end of
both MUC1 proteins, followed by the cytoplasmic domain
(CYT). The region comprising the proteolytic cleavage site
[Ligtenberg, et al., J. Biol. Chem., ibid. (1992)] of the
repeat array containing MUC1 protein (upper molecule) is
indicated by the two vertical dotted lines just N-terminal
to the transmembrane domain. The regions recognized by the
anti-repeat and anti-cytoplasmic domain (anti-cyt)
antibodies are indicated and potential N-linked
glycosylation sites are shown with an asterisk (*).

Fig. 13: Scheme showing the location of tyrosine and
cysteine residues in the MUC1 proteins. The locations of
tyrosine and cysteine residues are indicated above the
rectangles by vertical lines and asterisks, respectively.
Both MUC1 forms contain a hydrophobic N-terminal signal
sequence (slashed box at left of figure) that is
co-translationally cleaved (arrow at left of figure). This is
followed by the tandem repeat array (upper molecule) that is
illustrated by the block of closely-spaced vertical lines.
The highly hydrophobic 28 amino acid stretch constituting
the transmembrane domain (TM) is shown at the C-terminal end
of both MUC1 proteins, followed by the cytoplasmic domain
(CYT). The region comprising the proteolytic cleavage site
[Ligtenberg, et al., J. Biol. Chem., ibid. (1992)] of the
repeat array containing MUC1 protein (upper molecule) is
indicated by the two vertical dotted arrows just N-terminal
to the transmembrane domain. The regions recognized by the
anti-cytoplasmic domain (anti-cyt) antibodies are
indicated.

Fig. 14: Phosphotyrosine-Containing Peptide Sequences
Recognized by SH2 Domains and Their Comparison with MUC1
Cytoplasmic Domain Sequences. The sequence specificity of
the peptide-binding sites of SH2 domains has been previously
determined using a phosphopeptide library [Songyang, et al.,
Cell, Vol. 72, pp. 767-778 (1993)] and the data presented in
this Figure are in part from Table 3 of that reference. The
preferred amino acids 1, 2 and 3 residues C-terminal to
phosphotyrosine are indicated in the columns labelled
pY + 1, pY + 2 and pY + 3. The top line in each group
relates to the most preferred sequence, with lowered
preferences in the second and third lines. The boxed
sequences correlate best with MUC1 cytoplasmic domain
sequences that are indicated in the right-hand column.
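
The comparison described for Fig. 14 can be sketched as a short routine that, for each tyrosine in a sequence, reports the three residues C-terminal to it (positions pY + 1, pY + 2 and pY + 3) and compares them with a pattern. The Python sketch below is illustrative only: the function names, the example sequence and the example pattern are hypothetical, the sequence is not the MUC1 cytoplasmic domain, and the code makes no claim about which motifs a given SH2 domain actually prefers.

```python
def py_motifs(sequence: str):
    """Yield (position, motif) for every tyrosine followed by at least
    three residues; position is 1-based, motif is pY+1..pY+3."""
    seq = sequence.upper()
    for i, residue in enumerate(seq):
        if residue == "Y" and i + 3 < len(seq):
            yield i + 1, seq[i + 1:i + 4]


def matches(motif: str, pattern: str) -> bool:
    """Compare a motif with a pattern of equal length; a lowercase 'x'
    in the pattern matches any residue."""
    return len(motif) == len(pattern) and all(
        p in ("x", m) for m, p in zip(motif, pattern)
    )


# Hypothetical sequence and pattern, for illustration only.
example = "AYEEIGGYTNPAY"
for position, motif in py_motifs(example):
    print(position, motif, matches(motif, "xNx"))
```

The final tyrosine in the example is skipped because fewer than three residues follow it.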

Experimental work detailed below has unequivocally
demonstrated that:

a) the MUC1/X, MUC1/Y and MUC1/V proteins are highly and
differentially expressed in breast cancer tissue as compared
to normal breast tissue [see Fig. 9];

b) the MUC1/X, MUC1/Y and MUC1/V proteins are extensively
phosphorylated [see Fig. 9];

c) phosphorylation occurs almost exclusively on tyrosine
residues [see Fig. 10];

d) the phosphorylated MUC1/X, MUC1/Y and MUC1/V proteins
interact specifically with the SRC-homology (SH) domain SH2-
and SH3-containing proteins, GRB-2, SRC and phospholipase C
gamma-1 [Fig. 11]; and

e) the MUC1/X, MUC1/Y and MUC1/V proteins potentiate the
transformed phenotype of cells and significantly enhance the
in-vivo tumorigenic potential of mammary epithelial cells.

These experimental data demonstrate that the MUC1/X,
MUC1/Y and MUC1/V proteins function as cell surface receptor
molecules participating in signal transduction, and are
intimately related to the development of human breast
cancer.

To assess expression of the MUC1 proteins in-vivo,
extracts of human tissue samples were run on SDS denaturing
gels, transferred and probed with polyclonal antibodies
directed against the MUC1 cytoplasmic domain. Analyses were
performed on malignant breast tumor tissue samples [Fig. 9A,
lanes 4 and 5], together with extracts from breast tissue
adjacent to the biopsied tumor sample [Fig. 9A, lanes 3 and
6]. Little or no specific immunoreactivity was observed in
the non-malignant breast tissue samples [Fig. 9A, lanes 3
and 6].

In marked contrast thereto, proteins specifically
reactive with the anti-cytoplasmic domain antibodies were
highly expressed both in breast cancer cells grown in-vitro
and in the primary breast cancer tissue samples [Fig. 9A,
lanes 2, 4 and 5 respectively].

The immunoreactive proteins migrated to distinct
positions correlating to molecular masses of approximately
25-30, 35 [in the in-vitro grown breast cancer cells, lane
2], and 40-43 kDa. Some of these immunoreactive proteins
may be generated by proteolytic cleavages occurring on the
large polymorphic tandem repeat array containing MUC1
protein at positions N-terminal to the transmembrane domain
[Fig. 11, upper molecule, the two dotted arrows just
N-terminal to the transmembrane domain]. However, the
MUC1/X and MUC1/Y proteins [Fig. 12, lower molecule], and
the MUC1/V proteins, are also likely represented by one or
more of these immunoreactive proteins. In distinguishing
between these possibilities, we were considerably aided by
the identification of a third breast tumor tissue sample
[Fig. 9, lane 1], which expresses specific anti-cytoplasmic
domain immunoreactive proteins with molecular masses of
approximately 40-43 kDa and 35 kDa [compare Fig. 9, lanes 1
and 2]. Probing an identical immunoblot with monoclonal
antibodies that recognize an epitope contained within the
tandem repeat array showed high levels of expression of the
large polymorphic MUC1 proteins in the breast cancer cell
samples correlating to lanes 2, 4 and 5; no immunoreactive
proteins corresponding to the large polymorphic MUC1
proteins were detected in the third breast tumor correlating
to lane 1 [data not shown]. These data suggest therefore
that this third breast tumor tissue solely expresses the
MUC1/X, MUC1/Y and MUC1/V protein forms and thereby indicate
that the 35 and 40-43 kDa immunoreactive proteins are in
fact the MUC1/X and MUC1/Y proteins.



Tyrosine Phosphorylation of the MUC1/X, MUC1/Y and
MUC1/V Proteins

The calculated molecular mass of the MUC1/Y protein, as
determined by its primary amino acid sequence, is 25,986
Daltons. An increase in the molecular mass of the MUC1/Y
protein [to 35 and 40-43 kDa proteins] may occur by
post-translational modifications such as glycosylation
and/or phosphorylation. To investigate whether the MUC1/Y
protein is phosphorylated, radioactively-labelled inorganic
phosphate was added to stable transfectants expressing the
MUC1/Y protein, and cell lysates were subjected to anti-MUC1
cytoplasmic domain immunoprecipitation.
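
A calculated molecular mass of the kind quoted above is obtained by summing the average masses of the amino acid residues and adding one water molecule for the free termini. The following Python sketch shows that calculation with standard average residue masses; the function name and the short test peptides are hypothetical, and the MUC1/Y sequence itself (Fig. 6A) is not reproduced here, so the sketch does not attempt to reproduce the 25,986 Dalton figure.

```python
# Standard average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # added once for the N-terminal H and C-terminal OH


def calculated_mass(sequence: str) -> float:
    """Return the average molecular mass (Da) of an unmodified peptide
    given in one letter code."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


# Hypothetical test peptides, for illustration only.
print(round(calculated_mass("G"), 2))   # free glycine
print(round(calculated_mass("GG"), 2))  # glycylglycine
```

Post-translational modifications such as glycosylation or phosphorylation are not included in this sum, which is why an observed mass can exceed the calculated one.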

Specifically immunoprecipitated MUC1/Y protein migrated
with a molecular mass of 40-43 kDa, and demonstrated a
prominent signal [Fig. 9B, lane 2], indicating that the
40-43 kDa MUC1/Y proteins [Fig. 9] are phosphorylated
proteins. A phosphoamino acid analysis performed on the
isolated phosphorylated MUC1/Y protein shows that greater
than 90% of the phosphorylation occurs on tyrosine residues,
with much reduced levels of phosphoserine and almost
undetectable levels of threonine phosphorylation [Fig. 10].
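
The proportions reported from a phosphoamino acid analysis are each spot's share of the total phosphoamino acid signal. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and the intensity values are invented for illustration, not measurements from Fig. 10.

```python
def phospho_fractions(intensities):
    """Convert spot intensities (arbitrary units) into percentages of the
    total phosphoamino acid signal."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return {name: 100.0 * value / total for name, value in intensities.items()}


# Hypothetical PhosphorImager counts, for illustration only.
example = {"PY": 920.0, "PS": 70.0, "PT": 10.0}
for name, percent in phospho_fractions(example).items():
    print(f"{name}: {percent:.1f}%")
```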

Considering that within the cell greater than 99% of
total protein phosphorylation occurs on serine and
threonine residues, the almost exclusive tyrosine
phosphorylation of the MUC1/Y protein is especially
striking. Phosphorylated tyrosine residues play a pivotal
role in signal transduction pathways [M.J. Pazin and L.T.
Williams, ibid. (1992)] as, for example, those initiated by
growth factor receptors such as epidermal growth factor
receptor (EGF-R), platelet derived growth factor receptor
(PDGF-R), colony stimulating factor-1 receptor (CSF1-R),
etc. This suggests, therefore, that the extensively tyrosine
phosphorylated MUC1/Y protein may also be performing an
important signal-transducing function.



MUC1/Y Protein Interaction With SH2 Domain Proteins

Analysis of the MUC1 proteins demonstrates the
following features:

1) biased localization of tyrosine residues in the
cytoplasmic domain and sequences N-terminal to it [Fig. 13];

2) all tyrosine residues within the polymorphic MUC1
proteins are retained in the MUC1/X, MUC1/Y and
MUC1/V proteins [Fig. 13];

3) extensive similarity between the human and mouse MUC1
proteins within the amino acid MUC1 cytoplasmic domain
[Fig. 11]; and

4) marked similarity between tyrosine-containing sequences
located within the MUC1 cytoplasmic domain and
phosphotyrosine-containing peptide sequences that are
recognized by SH2 domain-containing proteins [Fig. 11].

Bearing in mind that the MUC1/X, MUC1/Y and MUC1/V
proteins are extensively phosphorylated on tyrosine
residues, these remarkable features indicate that the
MUC1/X, MUC1/Y and MUC1/V proteins act as receptor-like
molecules that participate in signal transduction. Thus, it
is now believed that the cytoplasmic domain of the MUC1/X,
MUC1/Y and MUC1/V proteins acts as a "surrogate" kinase
insert, in a way similar to CD19 [D.A. Tuveson, et al.,
"CD19 of B Cells as a Surrogate Kinase Insert Region to Bind
Phosphatidylinositol 3-Kinase," Science, Vol. 260, pp.
986-988 (1993)], and undergoes transphosphorylation on
tyrosine residues by other activated tyrosine kinases with
which it may specifically interact. This then forms a
signalling complex composed of the phosphorylated MUC1/X,
MUC1/Y and MUC1/V proteins and SH2 domain-containing
proteins [C.A. Koch, et al., "SH2 and SH3 Domains: Elements
that Control Interactions of Cytoplasmic Signalling
Proteins," Science, Vol. 252, pp. 668-674 (1991)], thereby
initiating signal transduction.

To test whether the cytoplasmic domain of the MUC1/Y
protein has the potential to interact specifically with SH2
domain-containing proteins, recombinant MUC1 cytoplasmic
domain was synthesized and radioactively phosphorylated on
its tyrosine residues with highly purified epidermal growth
factor receptor (EGF-R). Incubation of the phosphorylated
MUC1 cytoplasmic domain with either glutathione transferase
(GST) alone, or with Growth Factor Receptor Binding
Protein 2 [E.J. Lowenstein, et al., "The SH2 and SH3
Domain-Containing Protein GRB2 Links Receptor Tyrosine
Kinases to Ras Signalling," Cell, Vol. 70, pp. 431-442
(1992)]/GST (GRB-2/GST) fusion protein bound to agarose
beads, demonstrated marked binding to the GRB-2 protein
[Fig. 11B]. Analysis of the MUC1 cytoplasmic domain amino
acid sequence [Fig. 11A and Fig. 14] indicates that it may
also interact with additional SH2 domain-containing
proteins.

Further experimentation demonstrated that purified,
recombinant MUC1 cytoplasmic domain protein that had been
phosphorylated on its tyrosine residues specifically bound
to the SRC SH2 domain and to the SH2 domain derived from the
N-terminal part of the phospholipase C gamma 1 protein
[Fig. 11C, lanes 1 and 3]. Under identical conditions, no
binding was observed to the C-terminal p85
phosphatidylinositol (PI) 3' kinase SH2 domain.

To validate, in the in-vivo situation, the findings that
demonstrate in-vitro interactions of the MUC1/Y protein with
multiple SH2 domain-containing proteins and, in particular,
with the GRB-2 protein, human breast cancer tissue cell
lysates were prepared and incubated with either GST
(glutathione transferase) beads alone, or with GST/GRB-2
fusion protein beads. Bound proteins were analyzed by SDS
gel electrophoresis, transferred and subjected to probing
with anti-MUC1 cytoplasmic domain antibodies. The MUC1/Y
protein was detected only in the sample that had been
incubated with the GST/GRB-2 fusion protein beads,
indicating that in the in-vivo situation the MUC1/Y protein
potentially interacts with the GRB-2 protein.

MUC1/X, MUC1/Y and MUC1/V Protein Expression Alters Cell
Morphology and Increases Tumorigenic Potential

As the GRB-2 protein plays a key role in connecting
tyrosine kinase receptors with the ras signal transduction
system [E.J. Lowenstein, et al., ibid. (1992)] and, as shown
above, the MUC1/Y proteins contact the GRB-2 protein, the
effect of MUC1/Y protein expression on the morphology of ras
transformed 3T3 fibroblasts was investigated. Transfectants
were generated from ras transformed 3T3 fibroblasts with the
neomycin resistance gene alone, and in combination with an
expression vector harboring cDNA coding for either the
MUC1/Y proteins or the large tandem repeat array containing
MUC1 protein. The parental ras transformed 3T3 fibroblasts,
and control cells transfected only with the neomycin
resistance gene, grew mostly in foci and cell clusters. As
previously reported, transfectants expressing the large
tandem repeat array containing MUC1 protein displayed
decreased cellular aggregation and did not grow in foci;
this is likely due to the known anti-adhesive properties of
the tandem repeat array containing MUC1 protein. The effect
of MUC1/Y protein expression on cell morphology was,
however, immediately apparent. These transfectants
displayed a marked increase in the number of foci, an
altered phenotype that was observed in all independent
MUC1/Y protein-expressing transfectants analyzed. This
indicates that expression of the MUC1/Y protein does indeed
potentiate the transforming potential of the cell.



Next, tests were conducted to determine whether MUC1/Y
protein expression alters the tumorigenic potential of
mammary epithelial cells. Transfectants were generated
using the DA3 mouse mammary epithelial cell line, derived
from a DMBA-induced mouse mammary carcinoma, and expression
of the MUC1/Y protein in the transfectants was assessed by
Western blotting. Positive MUC1/Y transfectants, as well as
tandem repeat array containing MUC1 transfectants and
control neomycin transfectants, were injected
intramuscularly into female Balb/c mice at three different
cell concentrations (3.10*, 10= and 5.10=) and the mice were
monitored for tumor development.

Mice injected with transfectants expressing the tandem 
repeat array containing MUCl protein, or with the control 
neomycin transfectants, showed similar patterns of tumor 
development. In marked contrast, however, tumors developed 
rapidly in the MUCl/Y transfectant group and preceded the 
appearance of tumors in the other two groups by weeks to 
months, at all cell concentrations tested. For example, 
tumors developed in all mice (5 per group) injected with the 
MUCl/Y transfectant (5.10= cells per mouse) only 7 days 
following injection. Animals injected with the control 
neomycin transfectants showed tumor development in three out 
of five mice, with tumors first observed 6 weeks following 
injection. This pattern of increased tumorigenicity of the 
MUCl/Y transfectants was consistently observed at all other 
cell concentrations tested. 

The experimental work described above demonstrates that 
the MUCl/Y proteins are highly expressed in human breast 
cancer tissue; are extensively phosphorylated on tyrosine 
residues; interact specifically with the SRC homology domain 
(SH2) containing proteins GRB-2, SRC and phospholipase C 
gamma-1; and increase cellular tumorigenic potential. 

As is seen from the structure of the MUCl/X molecule, 
it is highly similar to the MUCl/Y molecule, except for the 
insertion of 18 amino acids between amino acid residue 
numbers 53 and 54 in the MUCl/Y sequence. The MUCl/X 
protein is therefore believed to function as a receptor 
molecule in a similar fashion to the MUCl/Y protein, 
although its affinity for ligand may differ. This is also 
true for the /alt configurations of MUCl/Y and MUCl/X. 

As is seen from the structure of the MUCl/V molecule, 
it is highly similar to the MUCl/Y molecule. The MUCl/V 
protein is therefore believed to function as a receptor 
molecule in a similar fashion to the MUCl/Y protein, 
although its affinity for ligand may differ. This is also 
true for the /alt configurations of MUCl/Y, MUCl/X and 
MUCl/V. 

Taken together, the above data indicate that the 
MUCl/X, MUCl/X/alt, MUCl/Y, MUCl/Y/alt, MUCl/V and 
MUCl/V/alt proteins act as signal-transducing receptor-like 
molecules that form a signalling complex which is intimately 
related to the oncogenic process. 

The MUCl/X, MUCl/Y and MUCl/V proteins are, however, 
different from classical receptor tyrosine kinases, in that 
they do not contain a catalytic tyrosine kinase domain. 
One of the postulates of the present hypothesis is that the 
cytoplasmic domains of the MUCl/X, MUCl/Y and MUCl/V 
proteins undergo transphosphorylation in a manner similar to 
that recently described for the B cell CD19 molecule [D.A. 
Tuveson, et al., ibid. (1993)] and for other cytokine 
receptors. 

Having identified the MUCl/X, MUCl/Y and MUCl/V 
receptors, it is now possible to prepare functional 
derivatives thereof, including purified receptors in soluble 
form. Thus, e.g., by deleting sequences downstream from 
glycine amino acid number 173 in the MUCl/X sequence 
[Fig. 5A] or glycine amino acid number 155 in the MUCl/Y 
sequence [Fig. 6A], or glycine amino acid number 140 in the 
MUCl/V sequence [Fig. 6C], one produces truncated forms of 
the membrane receptors, which lack transmembrane and 
intracytoplasmic domains, but retain the ligand-binding 
extracellular portion. The affinities of soluble receptors 
for their ligands are comparable to those of the membrane 
receptors, and thus said soluble receptors can compete with 
the membrane bound receptors and inhibit binding of ligands 
to the cell and the resulting activation thereof. 
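
The truncation described above amounts to keeping only the residues up to and including the stated glycine. A minimal Python sketch of that operation follows; the function name and the placeholder sequence are illustrative assumptions and are not taken from Figs. 5A, 6A or 6C.

```python
# Last residue retained in each soluble form, as stated in the text.
TRUNCATION_POINTS = {"MUCl/X": 173, "MUCl/Y": 155, "MUCl/V": 140}


def soluble_form(sequence: str, last_residue: int) -> str:
    """Return residues 1..last_residue (1-based, inclusive)."""
    if not 1 <= last_residue <= len(sequence):
        raise ValueError("truncation point lies outside the sequence")
    if sequence[last_residue - 1] != "G":
        raise ValueError("expected glycine at the truncation point")
    return sequence[:last_residue]


# Placeholder sequence (hypothetical): 154 alanines, a glycine at
# position 155, then 40 leucines standing in for the deleted domains.
placeholder = "A" * 154 + "G" + "L" * 40
truncated = soluble_form(placeholder, TRUNCATION_POINTS["MUCl/Y"])
print(len(truncated))
```

The glycine check is only a guard against off-by-one numbering when a real sequence is substituted for the placeholder.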

Furthermore, with the molecular characterization of the 
MUCl/X, MUCl/Y and MUCl/V receptor molecules described 
herein, one can design drugs that will specifically interact 
with these receptors. These drugs may then be used to 
target breast cancer cells, either for imaging or 
therapeutic purposes. 

Additionally, as receptor molecules are known to be 
shed off from cells into the peripheral circulation, assays 
employing antibodies directed against the MUCl/X, MUCl/Y and 
MUCl/V receptors can be developed to analyse the serum 
levels of these receptors. The serum concentrations of 
these proteins, which, as previously described, are 
expressed at high levels in breast cancer cells, may provide 
a means for diagnosing individuals with early breast cancer 
and/or for monitoring the progression of breast cancer in 
patients who have already been diagnosed. 

Based on the teachings of the present invention, these 
and other uses of the soluble receptors of the present 
invention will be clear to persons skilled in the art, 
especially in the light of the description and use of 
other soluble receptors in the literature [see, e.g., 
R. Fernandez-Botran, The FASEB Journal, Vol. 5, 
pp. 2567-2574 (1991) and S. Chamow, Int. J. Cancer, 
Supplement 7, pp. 69-72 (1992)]. 



Ligands 

Receptor molecules, such as the MUCl/X, MUCl/Y and 
MUCl/V proteins, specifically bind ligands. The MUCl/Z 
protein is secreted from the cell [Figs. 3 and 4] and, as 
detailed below, functions as a ligand for the MUCl/X, MUCl/Y 
and MUCl/V receptor proteins. The MUCl/W protein is 
believed to have a similar ligand function, based on its 
structure. This is also true for the /alt configurations of 
MUCl/Z and MUCl/W. 

By using antibodies generated in rabbits and directed 
against MUCl/Z, we have unequivocally shown that the MUCl/Z 
protein is synthesized in breast tumor tissue, but not in 
normal breast tissue, and that it migrates in 
SDS-polyacrylamide gels with an apparent molecular mass of 
approximately 25 kDa. Binding of the 25 kDa protein to 
anti-MUCl/Z antibodies could be specifically competed out by 
the addition of bacterial recombinant MUCl/Z protein, 
thereby confirming the identity of the 25 kDa protein as the 
MUCl/Z protein. 

Investigation of the amino acid sequence of the MUCl/Z 
protein revealed several interesting features. 

First, as the MUCl/Z protein contains a signal 
sequence, but does not harbour a transmembrane domain, it is 
expected to be secreted from the cell. 

Second, an outstanding feature of the MUCl/Z protein is 
the tryptophan-tryptophan (WW) sequence, localized just 
proximal to the C-terminal part of the protein [amino acid 
numbers 93 and 94 in the MUCl/Z sequence (Fig. 8A) and amino 
acid numbers 102 and 103 in the MUCl/Z/alt sequence (Fig. 
8B)]. This is unusual in that tryptophan is the least 
frequently occurring amino acid in proteins. A computer 
search for other proteins containing WW sequences revealed 
that the cell surface receptor for calcitonin contains the 
sequence GQRLWWYH, which is, strikingly, almost 
identical to the MUCl/Z sequence GQDLWWYN [amino acid 
numbers 89 to 96, Fig. 8A]. Such an occurrence of amino 
acid identity would occur at a probability of less than 1 in 
64 million. This suggests, therefore, that the MUCl/Z 
protein is in some way involved with cell surface receptor 
interactions. 
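
The comparison above can be checked position by position. The sketch below counts identical residues between the two quoted eight-residue sequences and, as an assumption not stated in the text, models each position as an independent draw from 20 equally likely amino acids; under that simple model the chance of the observed identities is 1 in 20 raised to the number of matches, which reproduces the 64 million figure quoted above.

```python
calcitonin_receptor = "GQRLWWYH"  # motif quoted in the text
mucl_z = "GQDLWWYN"               # residues 89 to 96 of MUCl/Z (Fig. 8A)

# Position-by-position identity between the two motifs.
matches = [a == b for a, b in zip(calcitonin_receptor, mucl_z)]
identical = sum(matches)
print(identical, "of", len(matches), "positions identical")

# Assumption: 20 equally likely amino acids, independent positions.
chance = 20 ** identical
print(f"1 in {chance:,}")

# Numbering check: the WW pair should fall at residues 93 and 94.
ww_start = 89 + mucl_z.index("WW")
print(ww_start, ww_start + 1)
```

Real amino acid frequencies are not uniform (tryptophan in particular is rare), so the uniform model is only a rough guide to the order of magnitude.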

Third, the MUCl/Z protein sequence contains several 
features that are found in other known ligands. For 
example, human epidermal growth factor (EGF) contains the 
sequence D L K W W and a similar sequence, D L W W, appears 
in the MUCl/Z protein. Significantly, the location of this 
sequence is identical in both proteins, and occurs just 
proximal to the carboxyl-terminus of the protein. 

Fourth, a highly-conserved sequence, consisting of 
CXCXXXXXG and which occurs in all growth factor 
ligand members, appears in the MUCl/Z protein [amino acid 
numbers 70 to 78, Fig. 8A]. 
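
A pattern such as CXCXXXXXG, where X stands for any residue, can be located in a protein sequence with a regular expression. The following is a minimal sketch; the function name and the toy sequence are hypothetical, and the toy sequence merely places the nine-residue motif at positions 70 to 78 to mirror the numbering given above.

```python
import re

# C-X-C-X(5)-G: cysteine, any residue, cysteine, five residues, glycine.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(C.C.{5}G))")


def find_motif(sequence: str):
    """Return (start, end, match) tuples with 1-based inclusive positions."""
    return [
        (m.start() + 1, m.start() + 9, m.group(1))
        for m in MOTIF.finditer(sequence)
    ]


# Hypothetical sequence: 69 alanines, the motif, then 20 alanines.
toy = "A" * 69 + "CACAAAAAG" + "A" * 20
print(find_motif(toy))
```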

Fifth, the MUCl/Z protein also contains several peptide 
sequences which are found in members of the prolactin/growth 
hormone family, such as prolactin, proliferin, and growth 
hormone. 

Taken together, the above considerations all support 
the present finding that the MUCl/Z protein acts as a ligand 
for the MUCl/Y receptor protein. 

The following experiments further support the above 
contention. The extracellular domain of the MUCl/Y receptor 
protein was synthesised as a recombinant bacterial protein, 
purified and radioactively labelled, and was then 
used to probe Western blots containing proteins found in 
breast tumor tissue lysates. The labelled MUCl/Y receptor 
protein specifically bound to a 25 kDa protein that 
comigrated with the MUCl/Z protein; this protein was present 
in breast tumor tissue lysates, yet was absent in 
normal breast tissue. Furthermore, in different cell 
types and tissues, the levels of the MUCl/Z protein 
directly correlated with the levels of the 25 kDa protein 
that binds the MUCl/Y receptor protein. 

The MUCl/Z protein is therefore the ligand of the 
MUCl/X, MUCl/Y and MUCl/V receptor proteins. This is true 
also for MUCl/Z/alt. 

MUCl/W and MUCl/W/alt also contain a signal sequence 
and do not have a transmembrane domain. They are thus 
secreted from the cell and, based on their structure, 
function as ligands in a similar fashion to the MUCl/Z and 
MUCl/Z/alt proteins. 

In the method of the present invention, the new MUCl 
proteins described and claimed herein can be administered in 
various ways. It should be noted that these new MUCl 
proteins can be administered alone, or in combination with 
pharmaceutically acceptable carriers. Compositions 
according to the present invention can be administered 
orally or parenterally, including intravenous, 
intraperitoneal, intranasal and subcutaneous administration. 
Implants of the compounds are also useful. The patient 
being treated is a warm-blooded animal, in particular a 
mammal, including man. 

The proteins of the present invention are administered 
in combination with other drugs, or singly, consistent with 
good medical practice. The composition is administered and 
dosed in accordance with good medical practice, taking into 
account the clinical condition of the individual patient, 
the site and method of administration, scheduling of 
administration, and other factors known to medical 
practitioners. The "effective amount" for purposes herein 
is thus determined by such considerations as are known in 
the art. 

When administering the new MUCl proteins parenterally, 
the pharmaceutical formulations suitable for injection 
include sterile aqueous solutions or dispersions and sterile 
powders for reconstitution into sterile injectable solutions 
or dispersions. The carrier can be a solvent or dispersing 
medium containing, for example, water, ethanol, polyol (for 
example, glycerol, propylene glycol, liquid polyethylene 
glycol, and the like), suitable mixtures thereof, and 
vegetable oils. 

Proper fluidity can be maintained, for example, by the 
use of a coating such as lecithin, by the maintenance of the 
required particle size in the case of dispersion, and by the 
use of surfactants. Non-aqueous vehicles such as cottonseed 
oil, sesame oil, olive oil, soybean oil, corn oil, sunflower 
oil, or peanut oil and esters, such as isopropyl myristate, 
may also be used as solvent systems for compound 
compositions. Additionally, various additives which enhance 
the stability, sterility, and isotonicity of the 
compositions, including antimicrobial preservatives, 
antioxidants, chelating agents, and buffers, can be added. 
Prevention of the action of microorganisms can be ensured by 
various antibacterial and antifungal agents, for example, 
parabens, chlorobutanol, phenol, sorbic acid, and the like. 
In many cases, it will be desirable to include isotonic 
agents, for example, sugars, sodium chloride, and the like. 
Prolonged absorption of the injectable pharmaceutical form 
can be brought about by the use of agents delaying 
absorption, for example, aluminum monostearate and gelatin. 
According to the present invention, however, any vehicle, 
diluent or additive used would have to be compatible with 
the compounds. 

Sterile injectable solutions can be prepared by 
incorporating the proteins utilized in practicing the 
present invention in the required amount of the appropriate 
solvent with various of the other ingredients, as desired. 

A pharmacological formulation of the new MUCl proteins 
described and claimed herein can be administered to the 
patient in an injectable formulation containing any 
compatible carrier, such as various vehicles, adjuvants, 
additives, and diluents; or the compounds utilized in the 
present invention can be administered parenterally to the 
patient in the form of slow-release subcutaneous implants or 
targeted delivery systems, such as polymer matrices, 
liposomes, and microspheres. An implant suitable for use in 
the present invention can take the form of a pellet which 
slowly dissolves after being implanted, or a biocompatible 
delivery module well-known to those skilled in the art. 
Such well-known dosage forms and modules are designed such 
that the active ingredients are slowly released over a 
period of several days to several weeks. 

Examples of well-known implants and modules useful in 
the present invention include: U.S. Patent No. 4,487,603, 
which discloses an implantable micro-infusion pump for 
dispensing medication at a controlled rate; U.S. Patent 
No. 4,486,194, which discloses a therapeutic device for 
administering medicants through the skin; U.S. Patent No. 
4,447,233, which discloses a medication infusion pump for 
delivering medication at a precise infusion rate; U.S. 
Patent No. 4,447,224, which discloses a variable flow, 
implantable infusion apparatus for continuous drug delivery; 
U.S. Patent No. 4,439,196, which discloses an osmotic drug 
delivery system having multi-chamber compartments; and U.S. 
Patent No. 4,475,196, which discloses an osmotic drug 
delivery system. These patents are incorporated herein by 
reference. Many other such implants, delivery systems, and 
modules are well-known to those skilled in the art. 

A pharmacological formulation of the new MUCl proteins 
utilised in the present invention can be administered orally 
to the patient. Conventional methods such as administering 
the compounds in tablets, suspensions, solutions, emulsions, 
capsules, powders, syrups and the like, are usable. Known 
techniques which deliver the new MUCl proteins orally or 
intravenously and retain the biological activity, are 
preferred. 

In one embodiment, the new MUCl proteins can be 
administered initially by intravenous injection to bring 
blood levels of the new MUCl proteins to a suitable level. 
The patient's MUCl protein levels are then maintained by an 
oral dosage form, although other forms of administration, 
dependent upon the patient's condition and as indicated 
above, can be used. The quantity of the new MUCl proteins 
to be administered will vary for the patient being treated, 
and will vary from about 100 ng/kg of body weight to 
100 mg/kg of body weight per day, and preferably will be 
from 10 µg/kg to 10 mg/kg per day. 
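
The ranges above are stated per kilogram of body weight, so the corresponding daily quantity scales linearly with weight. A minimal sketch of that unit arithmetic follows, assuming a hypothetical 70 kg body weight (not a figure from the text) and working in nanograms to keep the conversions explicit.

```python
NG_PER_UG = 1_000
NG_PER_MG = 1_000_000

# Per-kilogram daily ranges from the text, expressed in nanograms.
BROAD_NG_PER_KG = (100, 100 * NG_PER_MG)                # 100 ng/kg to 100 mg/kg
PREFERRED_NG_PER_KG = (10 * NG_PER_UG, 10 * NG_PER_MG)  # 10 ug/kg to 10 mg/kg


def daily_range_mg(body_weight_kg: int, per_kg_range_ng):
    """Scale a (low, high) per-kg range in ng to a whole-body range in mg."""
    low, high = per_kg_range_ng
    return (low * body_weight_kg / NG_PER_MG,
            high * body_weight_kg / NG_PER_MG)


weight_kg = 70  # hypothetical example weight
print(daily_range_mg(weight_kg, BROAD_NG_PER_KG))
print(daily_range_mg(weight_kg, PREFERRED_NG_PER_KG))
```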



EXAMPLE 1 

Immunoassays for Detecting and Quantitating the New MUCl 
Proteins in Body Fluids 

To detect and quantitate the new MUCl proteins in body 
fluids such as, for example, serum, one of the most useful 
methods is the two-antibody sandwich assay [see E. Harlow 
and D. Lane, ibid., Chapter 14, "Immunoassays," pp. 553-612 
(1988)]. 

Both polyclonal and monoclonal antibodies are prepared 
against the new MUCl proteins. To use the two-antibody 
assay, one antibody is purified and bound to a solid phase, 
and one of the new MUCl proteins which is to be assayed is 
allowed to bind. Unbound proteins are removed by washing 
and the labelled second antibody is allowed to bind to the 
antigen. After washing, the assay is quantitated by 
measuring the amount of labelled second antibody that is 
bound to the matrix and a calibration curve is established 
for the specific new MUCl protein which was assayed. 

To assay for the presence of the new MUCl proteins in 
body fluids, the above assay is repeated, using as test 
antigen a sample of the body fluid. 
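
Reading an unknown sample off the calibration curve can be sketched as a simple interpolation between the two nearest standards. The example below is illustrative only: the function name, the standard concentrations and the signal values are hypothetical, and linear interpolation is assumed here in place of whatever curve-fitting procedure a given laboratory uses.

```python
from bisect import bisect_left


def interpolate_concentration(standards, signal):
    """Estimate concentration from a measured signal.

    standards: (concentration, signal) pairs in which the signal
    rises strictly with concentration.
    """
    points = sorted(standards, key=lambda p: p[1])
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal outside the calibrated range")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][0]
    (c0, s0), (c1, s1) = points[i - 1], points[i]
    # Linear interpolation between the two bracketing standards.
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)


# Hypothetical standards: (ng/ml, bound label in counts per minute).
standards = [(0, 100), (10, 600), (20, 1100), (40, 2100)]
print(interpolate_concentration(standards, 850))
```

Signals outside the range spanned by the standards are rejected rather than extrapolated, since the curve is only established for the concentrations actually assayed.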

EXAMPLE 2 

Immunohistochemical Staining for the Detection of the New 
MUCl Proteins in Tissue Sections 

Histological studies for the detection of the new MUCl 
proteins are carried out on paraformaldehyde-fixed, 
paraffin-embedded tissue samples. 

The cells or tissues are fixed to the glass slides and 
permeabilized using standard procedures as described in E. 
Harlow and D. Lane, ibid., Chapter 10, "Cell Staining," pp. 
359-420 (1988). The antibodies against one of the new MUCl 
proteins are then added to the fixed and permeabilized cells 
or tissues. As in many other immunochemical techniques, 
the antibodies can be labelled directly with an 
enzyme, fluorochrome, etc., or detected by using a labelled 
secondary reagent that binds specifically to the primary 
antibody. 

EXAMPLE 3 

In-Vivo Imaging of Breast Cancer Cells with Labelled Ligands 
that Bind to the New MUCl Receptor Proteins 

The MUCl/Z, MUCl/Z/alt, MUCl/W and MUCl/W/alt ligand 
proteins are used to target and thereby image breast cancer 
cells in the living body. These ligand molecules are 
radioactively labelled with, for example, radioactive iodine 
(¹²⁵I) using, for example, the Bolton-Hunter reagent 
[¹²⁵I-labelled N-succinimidyl 3-(4-hydroxyphenyl)propionate]. 

A 0.5-1 mg/ml solution of the new MUCl ligand 
proteins is prepared in 0.1 M sodium borate (pH 8.5) and 
transferred to ice. Approximately 500 microcurie of 
Bolton-Hunter reagent is transferred to a 1.5 ml conical tube at 
0°C and the reagent is dried in a stream of dry nitrogen 
gas. About 10 microliters of the protein solution is added 
to the dry Bolton-Hunter reagent, mixed gently and returned 
to the ice. Following incubation on ice for 15 minutes, a 
stop solution consisting of 100 microliters of 0.5 M 
ethanolamine, 10% glycerol, 0.1% xylene cyanol, 0.1 M sodium 
borate (pH 8.5) is added and incubated for 5 min. at room 
temperature. The radioactively iodinated MUCl/Z, 
MUCl/Z/alt, MUCl/W and MUCl/W/alt ligand proteins are then 
separated from the iodinated Bolton-Hunter reagent on a 
gel-filtration column. 
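
The quantities in this protocol fix how much protein meets the 500 microcurie of reagent. The sketch below works through that arithmetic; the function name is illustrative, and the activity-per-microgram values are input ratios that assume, hypothetically, complete incorporation of the label, so they are upper bounds rather than expected yields.

```python
def protein_mass_ug(concentration_mg_per_ml: float, volume_ul: float) -> float:
    """Mass of protein in a given volume; mg/ml is numerically ug/ul."""
    return concentration_mg_per_ml * volume_ul


reagent_uci = 500   # approximate activity of Bolton-Hunter reagent used
volume_ul = 10      # volume of protein solution added

low_ug = protein_mass_ug(0.5, volume_ul)
high_ug = protein_mass_ug(1.0, volume_ul)
print(low_ug, high_ug)

# Upper bounds on activity per microgram, assuming full incorporation.
print(reagent_uci / high_ug, reagent_uci / low_ug)
```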

To image breast cancer cells in vivo, the labelled 
ligand molecules are injected intravenously into the 
patient, and the distribution of the radioactively labelled 
molecules is monitored using radioactive imaging devices. 

EXAMPLE 4 

Ligand as a Drug Delivery System for Ligand-Toxin Conjugates 

The MUCl/Z, MUCl/Z/alt, MUCl/W and MUCl/W/alt ligand 
proteins are conjugated to cytotoxic substances and thereby 
used as drug delivery systems to target and kill breast 
cancer cells within the body. Several cytotoxic substances 
for conjugation may be used, including cytotoxic proteins 
such as Pseudomonas exotoxin A and ricin [I. Pastan and D. 
FitzGerald, "Recombinant Toxins for Cancer Treatment," 
Science, Vol. 254, pp. 1173-1177 (1991)] or cytotoxic levels 
of radioactivity. 

Conjugation of the new MUCl proteins to cytotoxic 
proteins is performed by any of a number of coupling 
procedures, including glutaraldehyde coupling and periodate 
coupling. 

In the two-step glutaraldehyde method, glutaraldehyde 
is first coupled to the pure cytotoxic protein via the 
reactive amino groups available on the protein. The 
cytotoxic protein-glutaraldehyde mix is then purified and 
added to the MUCl/Z, MUCl/Z/alt, MUCl/W, and MUCl/W/alt 
ligand proteins. Unconjugated material is then separated 
from the cytotoxic protein/new MUCl protein conjugate. 

The cytotoxic protein is dissolved in 0.2 ml of 1.25% 
glutaraldehyde (electron microscopic grade) in 100 mM sodium 
phosphate (pH 6.8). After 18 hours at room temperature, 
excess free glutaraldehyde is removed by gel filtration on a 
gel matrix that is pre-equilibrated with 0.15 M NaCl. The 
peak fractions containing the glutaraldehyde-linked 
cytotoxic protein are concentrated by ultrafiltration or by 
dialysis against 100 mM sodium carbonate-sodium bicarbonate 
buffer (pH 9.5) containing 30% sucrose. The new MUCl ligand 
proteins dissolved in 0.1 ml of 0.15 M NaCl are added to the 
cytotoxic protein solution, the pH is kept above 9.0, and 
the mixture is incubated at 4°C for 24 hours. At this 
stage, 0.1 ml of 0.2 M ethanolamine (pH 7.0) is added and 
the mixture incubated for a further 2 hours at 4°C. 

The cytotoxic protein-new MUCl ligand conjugate is then 
separated from the unconjugated protein molecules by either 
gel filtration or gel electrophoresis. 

For periodate coupling, the new MUCl ligand proteins 
are resuspended in 1.2 ml of water and freshly-prepared 
0.1 M sodium periodate (0.3 ml) in 10 mM sodium phosphate 
buffer (pH 7.0) is added. The mixture is incubated at room 
temperature for 20 minutes and then dialysed against 1 mM 
sodium acetate (pH 4.0) at 4°C with several changes 
overnight. A 0.5 ml solution (10 mg/ml) of the cytotoxic 
protein (for example, ricin) is prepared in 20 mM sodium 
carbonate buffer (pH 9.5) and added to the solution of the 
periodate treated new MUCl ligand proteins. The mixture is 
incubated at room temperature for 2 hours. The Schiff's 
bases that have formed are then reduced by adding 100 
microliters of sodium borohydride (4 mg/ml) in water and 
incubating at 4°C for 2 hours. 
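
The volumes and stock concentrations given for periodate coupling determine the working quantities in each step. A minimal sketch of that arithmetic follows; the helper names are illustrative, and the periodate figure assumes simple mixing of the 1.2 ml protein suspension with the 0.3 ml periodate stock, before dialysis.

```python
def final_mM(stock_mM: float, stock_ul: float, total_ul: float) -> float:
    """Concentration after diluting a stock into a total volume."""
    return stock_mM * stock_ul / total_ul


def mass_mg(conc_mg_per_ml: float, volume_ul: float) -> float:
    """Mass contained in a volume of solution at a given concentration."""
    return conc_mg_per_ml * volume_ul / 1000


# 0.3 ml of 0.1 M periodate added to 1.2 ml of protein in water.
periodate_mM = final_mM(100, 300, 1200 + 300)

# 0.5 ml of cytotoxic protein at 10 mg/ml.
toxin_mg = mass_mg(10, 500)

# 100 microliters of sodium borohydride at 4 mg/ml.
borohydride_mg = mass_mg(4, 100)

print(periodate_mM, toxin_mg, borohydride_mg)
```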

The cytotoxic protein-new MUCl ligand conjugate is then 
separated from the unconjugated protein molecules by either 
gel filtration or gel electrophoresis. 

Cytotoxic protein-new MUCl ligand conjugates may also 
be prepared using recombinant DNA technology. In this 
method, recombinant bacteria are generated that synthesize 
fusion proteins consisting of the cytotoxic protein fused to 
the new MUCl ligand proteins. 

It will be evident to those skilled in the art that the 
invention is not limited to the details of the foregoing 
illustrative embodiments and examples, and that the present 
invention may be embodied in other specific forms without 
departing from the essential attributes thereof, and it is 
therefore desired that the present embodiments be considered 
in all respects as illustrative and not restrictive, 
reference being made to the appended claims, rather than to 
the foregoing description, and all changes which come within 
the meaning and range of equivalency of the claims are 
therefore intended to be embraced therein. 

WHAT IS CLAIMED IS: 

1. A biochemically pure MUCl protein, selected from 
the group consisting of MUCl/X, MUCl/X/alt, MUCl/Y, 
MUCl/Y/alt, MUCl/V, MUCl/V/alt, MUCl/W, MUCl/W/alt, MUCl/Z 
and MUCl/Z/alt or a functional derivative thereof devoid of 
a tandem repeat array. 

2. A MUCl protein according to claim 1 or a functional 
derivative thereof, comprising a partial amino acid 
sequence 

MTPGTQSPFFLLLLLTVLT[ATTAPKPAT] 
VVTGSGHASSTPGGEKETSATQRSSVP 
SSTEKNA 

and devoid of a tandem repeat array downstream thereof. 

3. A MUCl protein according to claim 1 or a functional 
derivative thereof, comprising a partial amino acid 
sequence 

MTPGTQSPFFLLLLLTVLT[ATTAPKPAT] 
VVTGSGHASSTPGGEKETSATQRSSVPS 

and devoid of a tandem repeat array downstream thereof. 

4. Biochemically pure MUCl/X and MUCl/X/alt, respectively 
comprising the sequences shown in Figs. 5A and 5B, or 
functional derivatives thereof. 

5. Biochemically pure MUCl/Y and MUCl/Y/alt, respectively 
comprising the sequences shown in Figs. 6A and 6B, or 
functional derivatives thereof. 

6. Biochemically pure MUCl/V and MUCl/V/alt, respectively 
comprising the sequences shown in Figs. 6C and 6D, or 
functional derivatives thereof. 

7. Biochemically pure MUCl/W and MUCl/W/alt, respectively 
comprising the sequences shown in Figs. 7A and 7B, or 
functional derivatives thereof. 

8. Biochemically pure MUCl/Z and MUCl/Z/alt, respectively 
comprising the sequences shown in Figs. 8A and 8B, or 
functional derivatives thereof. 

9. A pharmaceutical composition, comprising as an active 
ingredient therein a biochemically pure MUCl protein 
selected from the group consisting of MUCl/X, MUCl/X/alt, 
MUCl/Y, MUCl/Y/alt, MUCl/V, MUCl/V/alt, MUCl/W, MUCl/W/alt, 
MUCl/Z, MUCl/Z/alt and functional derivatives thereof, in 
combination with a pharmaceutically acceptable carrier. 

10. A pharmaceutical composition for the treatment of 
human breast cancer, comprising as an active ingredient 
therein a biochemically pure MUCl protein selected from the 
group consisting of MUCl/X, MUCl/X/alt, MUCl/Y, MUCl/Y/alt, 
MUCl/V, MUCl/V/alt, and functional derivatives thereof, in 
soluble form and in combination with a pharmaceutically 
acceptable carrier. 



11. A conjugated toxin for the treatment of human breast 
cancer, comprising a MUCl protein selected from the group 
consisting of MUCl/W, MUCl/W/alt, MUCl/Z, MUCl/Z/alt and 
functional derivatives thereof, attached to a cytotoxic 
agent. 



12. A diagnostic agent for the detection of human breast 
cancer cells, comprising a detectably labelled, 
biochemically pure MUCl protein selected from the group 
consisting of MUCl/W, MUCl/W/alt, MUCl/Z, MUCl/Z/alt and 
functional derivatives thereof. 



13. A diagnostic agent for identification of sites in the 
body to which breast cancer cells have spread, comprising a 
detectably labelled MUCl protein selected from the group 
consisting of MUCl/W, MUCl/W/alt, MUCl/Z, MUCl/Z/alt and 
functional derivatives thereof. 

14. A method for the treatment of human breast cancer, 
comprising administering to an individual having human 
breast cancer cells an amount of soluble MUCl/X, MUCl/X/alt, 
MUCl/Y, MUCl/Y/alt, MUCl/V, or MUCl/V/alt receptors, 
sufficient to inhibit the binding of MUCl ligands to said 
cells. 

15. A method for the treatment of human breast cancer, 
comprising administering to an individual having human 
breast cancer cells an amount of a ligand-toxin conjugate 
comprising a ligand selected from MUCl/W, MUCl/W/alt, MUCl/Z 
or MUCl/Z/alt, fused to a cytotoxic toxin. 

16. A DNA sequence encoding the protein MUCl/X, comprising 
the nucleotide sequence substantially as shown in Fig. 5A or 
a functional derivative thereof devoid of a tandem repeat 
array. 

17. A DNA sequence encoding the protein MUCl/X/alt, 
comprising the nucleotide sequence substantially as shown in 
Fig. 5B or a functional derivative thereof devoid of a 
tandem repeat array. 

18. A DNA sequence encoding the protein MUCl/Y, comprising 
the nucleotide sequence substantially as shown in Fig. 6A or 
a functional derivative thereof devoid of a tandem repeat 
array. 

19. A DNA sequence encoding the protein MUCl/Y/alt, 
comprising the nucleotide sequence substantially as shown in 
Fig. 6B or a functional derivative thereof devoid 
of a tandem repeat array. 

20. A DNA sequence encoding the protein MUCl/V, comprising 
the nucleotide sequence substantially as shown in Fig. 6C or 
a functional derivative thereof devoid of a tandem repeat 
array. 

21. A DNA sequence encoding the protein MUCl/V/alt, 
comprising the nucleotide sequence substantially as shown in 
Fig. 6D or a functional derivative thereof devoid of a 
tandem repeat array. 



22. A DNA sequence encoding the protein MUCl/W, comprising 
the nucleotide sequence substantially as shown in Fig. 7A or 
a functional derivative thereof devoid of a tandem repeat 
array. 

23. A DNA sequence encoding the protein MUCl/W/alt, 
comprising the nucleotide sequence substantially as shown in 
Fig. 7B or a functional derivative thereof devoid of a 
tandem repeat array. 

24. A DNA sequence encoding the protein MUCl/Z, comprising 
the nucleotide sequence substantially as shown in Fig. 8A or 
a functional derivative thereof devoid of a tandem repeat 
array. 

25. A DNA sequence encoding the protein MUCl/Z/alt, 
comprising the nucleotide sequence substantially as shown in 
Fig. 8B or a functional derivative thereof devoid of a 
tandem repeat array. 

26. A DNA sequence according to any of claims 16-25, being 
a cDNA. 

27. An in-vitro bioassay for determining the presence of 
breast cancer cells in a sample, comprising contacting a 
tissue sample with a diagnostic agent, said agent comprising 
a detectably labelled MUCl protein selected from the group 
consisting of MUCl/W, MUCl/W/alt, MUCl/Z, MUCl/Z/alt and 
functional derivatives thereof. 

28. An in-vitro bioassay for determining the presence of 
breast cancer cells in a sample, comprising: 

a) isolating a specimen selected from the group 
consisting of tissue and cell biopsies, and 

b) assaying said specimen with antibodies selected 
from the group consisting of monoclonal and polyclonal 
antibodies that recognize a protein selected from the group 
consisting of MUCl/X, MUCl/X/alt, MUCl/Y, MUCl/Y/alt, 
MUCl/V, MUCl/V/alt, MUCl/W, MUCl/W/alt, MUCl/Z, MUCl/Z/alt 
and functional derivatives thereof. 

29. A DNA construct selected from the group consisting of 
cDNA coding for a biochemically pure MUCl protein selected 
from the group consisting of MUCl/X, MUCl/X/alt, MUCl/Y, 
MUCl/Y/alt, MUCl/V, MUCl/V/alt, MUCl/W, MUCl/W/alt, MUCl/Z 
and MUCl/Z/alt or a functional derivative thereof devoid of 
a tandem repeat array. 

30. The construct of claim 29, which is contained in a 
vector. 

31. A host cell transfected with the construct of claim 30. 

32. A bioassay for screening substances for the ability to 
inhibit mammary carcinoma, comprising: 

a) administering the substance to a cell transfectant 
that expresses a protein selected from the group consisting 
of MUCl/X, MUCl/X/alt, MUCl/Y, MUCl/Y/alt, MUCl/V, 
MUCl/V/alt, and functional derivatives thereof; and 

b) determining whether such substance inhibits the 
growth of the cell transfectant. 

33. A purified antibody which specifically binds a protein 
of claim 1. 

34. The antibody of claim 33, wherein said antibody is 
conjugated to a therapeutic drug. 

35. The antibody of claim 33, wherein said antibody is 
conjugated to a detectable moiety. 

36. The antibody of claim 33, wherein said antibody is 
bound to a solid support. 

37. A bioassay for determining the amount of a MUC1
protein selected from the group consisting of MUC1/X,
MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V, MUC1/V/alt, MUC1/W,
MUC1/W/alt, MUC1/Z and MUC1/Z/alt or a functional derivative
thereof devoid of a tandem repeat array in a biological
sample, comprising:

a) contacting said biological sample with an antibody
under conditions such that a specific complex of said
antibody and said MUC1 protein can be formed; and

b) determining the amount of said antibody/MUC1
protein complex, the amount of the complex indicating the
amount of said MUC1 protein in the biological sample.
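
Step (b) relates the measured amount of complex to the amount of protein without prescribing a method. One common approach, offered here only as a sketch, is to read the sample signal against a standard curve built from calibrators of known amount. The calibrator values, the units, and the function name are hypothetical and are not taken from the claims.

```python
from bisect import bisect_right


def amount_from_signal(signal, standards):
    """Linearly interpolate an analyte amount from a measured complex signal.

    `standards` is a list of (known_amount, measured_signal) pairs whose
    signals strictly increase with amount. Signals outside the calibrated
    range are rejected rather than extrapolated.
    """
    if len(standards) < 2:
        raise ValueError("at least two calibrators are required")
    pts = sorted(standards, key=lambda p: p[1])
    signals = [s for _, s in pts]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal outside the calibrated range")
    # Index of the calibrator just above the signal (clamped at the top end).
    i = min(bisect_right(signals, signal), len(pts) - 1)
    (a0, s0), (a1, s1) = pts[i - 1], pts[i]
    return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)


# Hypothetical calibrators: (amount in ng/mL, absorbance).
calibrators = [(0.0, 0.05), (10.0, 0.25), (50.0, 1.05), (100.0, 2.05)]
print(amount_from_signal(0.65, calibrators))  # about 30 ng/mL
```

Piecewise-linear interpolation is the simplest choice; immunoassay curves are frequently fitted with nonlinear models instead, which this sketch does not attempt.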

38. A method of detecting the presence of cancer in a
subject, comprising determining the presence of a detectable
amount of a MUC1 protein selected from the group consisting
of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V,
MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z and MUC1/Z/alt or a
functional derivative thereof devoid of a tandem repeat
array in a biopsy from said subject, the presence of a
detectable amount of said MUC1 protein relative to the
absence of said MUC1 protein in a normal control indicating
the presence of cancer.

39. A method of determining the prognosis of a subject
having cancer, comprising determining the presence of a
detectable amount of a MUC1 protein selected from the group
consisting of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt,
MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z and
MUC1/Z/alt or a functional derivative thereof devoid of a
tandem repeat array in a biopsy from said subject, the
presence of a detectable amount of said MUC1 protein
relative to the absence of said MUC1 protein in a normal
control indicating a decreased chance of long-term
survival.

[Figure: schematic drawing annotated with numeric position
labels and N-terminal peptide sequences
(MTPGTQSPFFLLLLLTVLTVV, TGSGHASSTPGGEKETSATQ). The drawing
itself is not reproduced here.]

[Fig-5A: nucleotide sequence, numbered in rows of 60 from
the ATG start codon to the TAG stop codon, printed above its
deduced amino acid sequence (MTPGTQSPFFLLLLLTVLTV ...
YTNPAVAATSANL). The sequence rows are not legibly
reproduced here.]
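
The sequence figures pair each row of 60 nucleotides with the 20 amino acids it encodes. A minimal sketch of that codon-by-codon translation follows, assuming the standard genetic code. The function name `translate` and the table layout are illustrative and not part of the specification. The input is the row for nucleotides 61 to 120, which is legible in several of the figures.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES),
        AMINO_ACIDS,
    )
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence in frame; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


row = "GTTACAGGTTCTGGTCATGCAAGCTCTACCCCAGGTGGAGAAAAGGAGACTTCGGCTACC"
print(translate(row))  # VTGSGHASSTPGGEKETSAT
```

Running the same function over a full, verified sequence would give a way to check a printed translation row by row. The sketch raises `KeyError` on any character other than A, C, G, or T, which is useful for spotting transcription damage.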

[Fig-5B: nucleotide sequence with partial amino acid
annotation, ending in the TAG stop codon. The sequence rows
are not legibly reproduced here.]

[Fig-6A: nucleotide sequence, numbered in rows of 60 from
the ATG start codon to the TAG stop codon, printed above its
deduced amino acid sequence (ending LSYTNPAVAATSANL). The
sequence rows are not legibly reproduced here.]

SUBSTITUTE SHEET (RULE 26)

PCT/IB95/00627

7/15

[Figure (label not preserved): nucleotide sequence, numbered
in rows of 60 from the ATG start codon to the TAG stop
codon, printed above its deduced amino acid sequence (second
row TTAPKPATVVTGSGHASSTP, ending SANL). The sequence rows
are not legibly reproduced here.]

[Figure (label not preserved): nucleotide sequence, numbered
in rows of 60 from the ATG start codon to the TAG stop
codon, printed above its deduced amino acid sequence
(MTPGTQSPFFLLLLLTVLTV ... NPAVAATSANL), with residue numbers
in the margin. The sequence rows are not legibly reproduced
here.]

[Fig- (number not legible): nucleotide sequence, numbered in
rows of 60 from the ATG start codon to the TAG stop codon,
printed above its deduced amino acid sequence
(MTPGTQSPFFLLLLLTVLTA ... AVAATSANL). The sequence rows are
not legibly reproduced here.]

[Figure (label not preserved): opening rows of a nucleotide
sequence starting at the ATG start codon, printed above the
deduced amino acid sequence (beginning
MTPGTQSPFFLLLLLTVLTV). The sequence breaks off after the
fourth row.]

[Figure (label not preserved): nucleotide sequence, numbered
in rows of 60 from the ATG start codon to a TAA stop codon,
printed above its deduced amino acid sequence
(MTPGTQSPFFLLLLLTVLTA ... FKPPV). The sequence rows are not
legibly reproduced here.]

[Figure (label not preserved): nucleotide sequence, numbered
in rows of 60 from the ATG start codon to a TGA stop codon,
printed above its deduced amino acid sequence (ending
WASPILSSGQDLWWYN). The sequence rows are not legibly
reproduced here.]

[Fig-8B: nucleotide sequence, numbered in rows of 60 from
the ATG start codon to a TGA stop codon, printed above its
deduced amino acid sequence (MTPGTQSPFFLLLLLTVLTA ...
LWWYN). The sequence rows are not legibly reproduced here.]

RECOGNITION SPECIFICITIES OF SH2 DOMAINS
SH2 DOMAIN    pY+1    pY+2    pY+3    MUC1 SEQUENCE